Concept | Definition | Key Points / Explanation |
---|---|---|
Cloud Computing (NIST Definition) | A model for enabling on-demand network access to a shared pool of configurable computing resources (e.g., servers, storage, networking, databases) that can be rapidly provisioned and released with minimal management effort or service provider interaction. |
On-Demand Self-Service | Ability to provision computing resources automatically, without requiring human interaction with the service provider. |
Broad Network Access | Services are available over the network and accessible through standard mechanisms (e.g., web browsers, mobile clients, APIs). |
Resource Pooling | The provider’s computing resources serve multiple customers via a multi-tenant model, while ensuring data privacy and security for each. |
Rapid Elasticity | The ability to scale resources up or down—often automatically—to match demand. |
Measured Service | Resource usage is monitored, measured, and reported, enabling a pay-for-use model. |
Concept | Definition | Key Points / Explanation |
---|---|---|
Public Cloud | Cloud services offered by third-party providers over the public internet. |
Multi-Cloud | Using multiple public clouds together (e.g., GCP + AWS + Azure) as part of one environment or strategy. |
Private Cloud | An on-premises cloud dedicated to a single organization, still meeting the five essential cloud characteristics. |
Hybrid Cloud | A single “cloud-like” environment that combines private cloud and public cloud. |
Service Model | Definition | Key Points / Explanation |
---|---|---|
IaaS (Infrastructure as a Service) | The vendor abstracts and manages the underlying data center, networking, servers, storage, and virtualization layers. You manage OS, runtime, apps, data. |
PaaS (Platform as a Service) | The vendor abstracts data center, networking, servers, storage, virtualization, and the runtime/operating system. You manage the application and data. |
SaaS (Software as a Service) | The vendor handles everything, delivering the software to you as a web or API-based service. |
XaaS (Anything as a Service) | An umbrella term for any service delivered over the cloud (e.g., FaaS, CaaS, DBaaS). |
Concept | Definition | Key Points / Explanation |
---|---|---|
Private Global Network | Google’s private high-bandwidth, low-latency network that interconnects its data centers worldwide. | - Traffic typically remains on Google’s private backbone, ensuring high performance and security. - Includes extensive fiber, points of presence (PoPs), and subsea cables across continents. |
Regions | Independent geographic areas that contain multiple zones. | - Each region can have several zones (usually 3+). - Inter-zone latency within a region is typically under 5 ms. - Deploying services across multiple zones in a region improves fault tolerance. |
Zones | The smallest deployment entity within a region, acting as an isolated failure domain. | - Resources (like Compute Engine instances) live in a specific zone. - Redundant design: if one zone goes down, others remain unaffected. |
Multi-Regions | Large geographic areas containing multiple regions. | - Used for maximum redundancy, distribution, or availability. - Data is stored and replicated across multiple regions within the multi-region. |
Points of Presence (PoP) | Edge locations or network entry points where traffic enters/exits Google’s backbone. | - Optimizes latency by routing requests to the nearest PoP. - Also known as Google’s “edge network.” |
Subsea Cables | High-capacity undersea fiber cables connecting continents. | - Google invests heavily in private subsea cables. - Enables fast, low-latency connectivity between major geographic areas. |
Service | Service Model | Definition | Key Points / Explanation |
---|---|---|---|
Compute Engine | IaaS (Infrastructure as a Service) | Virtual machines (VMs) running in Google’s data centers. | - Complete control over OS, software, libraries. - You manage patching and scaling (auto-scaling possible with instance groups). - Supports custom or public images, plus marketplace solutions. |
Google Kubernetes Engine (GKE) | CaaS (Container as a Service) | Managed Kubernetes environment for container orchestration. | - Runs on top of Compute Engine instances as worker nodes. - Automates container deployment, scaling, and management. - Based on open-source Kubernetes, so workloads are portable to on-prem or other clouds that also run Kubernetes. |
App Engine | PaaS (Platform as a Service) | Fully managed platform for building and hosting web apps. | - Auto-scales based on traffic. - Abstracts OS management, security updates, runtime patching. - Supports common languages (Java, Python, Go, Node.js, etc.) and custom runtimes. |
Cloud Functions | FaaS (Function as a Service) | Serverless environment to run short-lived functions triggered by events. | - No server management; pay only for execution time. - Integrates well with other GCP services (e.g., Cloud Storage triggers, Pub/Sub, etc.). - Good for lightweight, event-driven microservices, data processing, and real-time event handling. |
Cloud Run | Serverless (also considered FaaS/CaaS) | Fully managed compute for containerized apps, built on Knative. | - Deploy any container with your choice of language, runtime, or libraries. - Scales to zero when no traffic; scales up instantly on demand. - Often described as “serverless containers.” |
Service | Type | Definition | Key Points / Explanation |
---|---|---|---|
Cloud Storage | Object Storage | Scalable, durable, highly available object storage (documents, images, backups, etc.). | - 11 “nines” of durability (99.999999999%). - Multiple storage classes (Standard, Nearline, Coldline, Archive) for cost optimization based on access frequency. - Location options: Regional, Dual-Region, or Multi-Region. |
Filestore | File Storage | Fully managed NFS (Network File System) for sharing files across multiple Compute Engine VMs or GKE. | - NFS v3 compliant. - Useful when multiple VMs/containers need concurrent read/write access to the same shared file system. |
Persistent Disk | Block Storage | Durable block storage volumes for Compute Engine (VM) instances. | - Attached to a single VM for OS/data disk use. - Available as HDD (Standard) or SSD for higher IOPS and lower latency. - Zonal or regional replication options. - Disk types in GCP: Persistent Disks retain data even when the VM is stopped and come in standard (HDD) and SSD variants; Local SSDs are very fast disks physically attached to the VM host but lose their data when the VM stops; Boot Disks are persistent disks that hold the operating system the VM starts from. |
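
As a rough sketch (bucket name, region, and file are placeholders), the storage classes above are chosen at bucket-creation time, for example with gsutil:

```bash
# Create a Nearline bucket in a single region, copy a file in, and inspect it.
gsutil mb -l us-central1 -c nearline gs://my-example-backups-bucket
gsutil cp ./backup.tar.gz gs://my-example-backups-bucket/
gsutil ls -L -b gs://my-example-backups-bucket   # shows bucket metadata, incl. storage class
```
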
Service | Type | Definition | Key Points / Explanation |
---|---|---|---|
Cloud SQL | Relational (SQL) | Fully managed SQL database (MySQL, PostgreSQL, SQL Server). | - Automated backups, replication, patching, scaling. - Zonal high availability; can set up cross-region read replicas. |
Cloud Spanner | Relational (SQL) | Horizontally scalable, strongly consistent, globally distributed relational database. | - Handles high transaction volume with strong consistency. - Multi-region or even global replication. - Used for mission-critical apps needing global scale and ACID transactions. |
Bigtable | NoSQL (Wide-Column) | Fully managed, high-throughput, low-latency NoSQL database (based on Google’s internal Bigtable system). | - Good for large analytic or operational workloads (e.g., IoT, time-series data). - Scales to petabytes. - Cluster resizing without downtime. |
Datastore / Firestore | NoSQL (Document) | Schemaless, document-style databases often used for web/mobile/IoT apps. | - Datastore: multi-region replication, ACID transactions. - Firestore: near-real-time updates, offline support, easy integration with Firebase for mobile. - Both scale automatically and can handle millions of reads/writes. |
Memorystore | In-Memory Cache | Fully managed Redis or Memcached in-memory data store. | - Low-latency caching layer for frequently accessed data. - Helps scale read performance in high-traffic scenarios. - Unlike local SSDs (fast, VM-attached, ephemeral disks), Memorystore is a shared managed cache that multiple applications can reach over the network for rapid data access. |
Concept / Service | Definition | Key Points / Explanation |
---|---|---|
VPC (Virtual Private Cloud) | A virtualized global network that manages networking for GCP resources. | - Acts like your “virtual data center.” - Global scope spans all GCP regions. - You can create subnets per region, control IP ranges, and segment networks. - Each project has a default VPC; you can create additional VPCs if needed. |
Firewall Rules | Control inbound/outbound traffic at the instance level; rules are global resources applied per VPC network. | - Defaults exist (allow internal traffic, SSH, etc.), and you can define custom rules. - Stateful firewall; traffic is allowed or denied based on rules. |
Routes | Specify how traffic leaves an instance and gets routed to other destinations. | - Default route to the internet, plus additional routes for custom routing scenarios. - Work with firewall rules to manage traffic flow. |
Load Balancing | Distributes inbound traffic across multiple backends/instances to handle workloads efficiently. | - HTTP/HTTPS Load Balancing: Global, Layer 7 load balancing with content-based routing. - Network Load Balancing: Regional, Layer 4 load balancing for TCP/UDP traffic. |
Cloud DNS | Google’s managed DNS service, using the same infrastructure as Google’s own DNS. | - Create/maintain DNS records (A, AAAA, MX, CNAME, TXT, etc.) in managed zones. - Low-latency, high-availability DNS resolution. |
Cloud VPN | Secure IPsec connection between an on-premises network and a GCP VPC over the public internet. | - Encrypted traffic using VPN tunnels. - Ideal for lower-volume or basic hybrid scenarios. |
Dedicated Interconnect | Dedicated high-speed, private connection from an on-premises data center to GCP, bypassing the public internet. | - Provides low latency, high availability. - Suited for large data transfers, stable throughput needs. |
Peering (Direct/Carrier) | Connects your network to Google’s edge through a peering exchange or via a carrier partner. | - Direct peering: exchange traffic directly with Google at a peering facility. - Carrier peering: traffic flows to GCP through a partner’s network. |
Concept | Definition / Explanation |
---|---|
Resource | Any entity you use in GCP, e.g., Compute Engine VMs, Cloud Storage buckets, Cloud SQL instances, plus higher-level “account” resources (projects, folders, organization). |
Resource Hierarchy | A structure (organization → folders → projects → service-level resources) for organizing and managing cloud resources. |
Parent-Child Relationship | Policies/permissions set at a parent resource are inherited by its children. Each child has exactly one parent. |
Organization Node | The root of the hierarchy, associated with one G Suite/Cloud Identity domain. IAM policies set at this level apply across all folders/projects/resources. |
Folders | An optional grouping mechanism between organization and projects (e.g., by department or environment). Must have an organization node to use folders. Each folder can contain multiple child folders or projects, but any folder/project has exactly one parent. |
Projects | The required base-level grouping entity for using GCP services; all service resources belong to a single project. A project has exactly one parent: either a folder or, if no folders are used, the organization itself. |
Service-Level Resources | Actual resources you create (VMs, buckets, databases, etc.). They sit at the bottom of the hierarchy, inside a project. |
Labels | Key-value pairs that help organize and filter resources, especially for cost tracking. |
IAM Policy Inheritance | Setting an IAM policy at one level automatically applies that policy to child objects. E.g., a role assigned at the folder level flows down to all projects/resources under that folder. |
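
A minimal sketch of inheritance in practice, assuming a placeholder folder ID and user: granting a role on a folder makes it flow down to every project and resource beneath it.

```bash
# Grant a role at the folder level; it is inherited by all child projects/resources.
gcloud resource-manager folders add-iam-policy-binding 123456789012 \
    --member="user:dev-lead@example.com" \
    --role="roles/compute.viewer"

# Inspect the policy set directly on the folder.
gcloud resource-manager folders get-iam-policy 123456789012
```
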
Step / Concept | Definition / Explanation |
---|---|
Free Trial | - 12-month free trial with $300 USD credit (or local currency equivalent) to explore GCP. - Personal (individual) account only, not a business account. - Ends when the credits are used up or 12 months pass. |
Always Free | - Set of services/resources GCP provides at no cost indefinitely (within usage limits). - Available on any upgraded GCP account (i.e., after you’ve added billing details). - If usage goes beyond the free limits, standard billing applies. |
Requirements | - A new Gmail address (to avoid conflicts). - A valid credit/debit card (for verification). - (Optional) Use an Incognito/Private browser session to avoid accidental account collisions. |
Steps to Create | 1. Go to the Free Trial URL (cloud.google.com/free). 2. Create a new Google/Gmail account if needed. 3. Provide credit card info for identity verification (you are not charged unless you upgrade to a paid account). 4. Accept terms to start the free trial. |
Verification & Setup | - Google may send you a phone verification code. - Once complete, you see your free trial credit displayed in the GCP Console’s billing section. |
Concept | Definition / Explanation |
---|---|
Two-Step Verification | An extra layer of account security requiring something you know (your password) and something you have (phone/security key). If your password is compromised, attackers still need physical access to the second factor. |
Verification Methods | - Text/Voice Call: Google sends a code via SMS or phone call. - Authenticator App: Generates one-time codes (e.g., Google Authenticator). - Google Prompt: Approve sign-ins via push notification. - Security Key: Physical USB/NFC key for strong protection against phishing. |
Backup Codes | One-time codes you can download/print in case your phone/security key is unavailable. |
Best Practices | - Always enable multi-factor auth (MFA) for all admin and personal GCP accounts. - Use push notifications or security keys if possible (fewer SIM-swap risks). |
Feature | Definition / Explanation |
---|---|
Console Home Page | - Displays summary “cards” about recent activity, project info, billing, etc. - You can customize which cards appear. |
Navigation Menu | - “Hamburger” menu in the top-left corner gives access to all GCP services grouped by category (Compute, Networking, etc.). - You can “pin” frequently used services to the top for quick access. |
Search Bar | - Quickly locate services, APIs, or even specific resources by name. |
Project Selector | - Choose your active project in the top menu. Project-level resources (VMs, Cloud Functions, etc.) are accessed only under the currently selected project. |
Console Top Bar | - Activate Cloud Shell icon for terminal access to GCP resources. - Notifications (bell icon) for events/logs. - Help (question mark icon) for docs/keyboard shortcuts. |
Activity & Recommendations | - The Activity tab shows recent actions (creating resources, changing IAM, etc.). - Recommendations show cost or performance suggestions from GCP’s Recommender service (e.g., right-sizing VMs). |
Concept | Definition / Explanation |
---|---|
Cloud Billing Account | Tracks costs for GCP usage. Linked to at least one payment method (credit card or bank account). Can pay for multiple projects. |
Payments Profile | A Google-level resource storing payment methods, billing contacts, and legal information. Used across Google services (not just GCP). |
Billing Account Types | - Self-Service (Online): Credit/debit card charged automatically; invoices visible online. - Invoiced (Offline): Must qualify for invoice billing; pay via check/wire transfer and receive monthly invoices by mail or electronically. |
Sub-Accounts | Used by resellers to group charges (e.g., multiple customers) on a separate section of the invoice. Linked to a master billing account. |
Ownership & Linking | - A single organization owns a billing account (though it can pay for projects in different orgs). - A project without an attached billing account can only use GCP’s free services (limited usage). |
Roles & Permissions | Billing access is controlled by IAM roles, e.g., Billing Account Administrator, Billing Account Creator, Billing Account User, etc. |
Creating/Editing/Closing | - You can create new billing accounts (requires the Billing Account Creator role). - Link/unlink projects (Project Billing Manager plus appropriate billing roles). - Close billing accounts after detaching all projects. |
Concept | Definition / Explanation |
---|---|
Committed Use Discounts | You commit to a specific level of resources/usage for 1-3 years, in exchange for reduced hourly rates on those resources. |
Resource-Based Commitments | Commit to a certain amount of vCPUs, memory, GPUs, etc. in a particular region for Compute Engine. Discounts up to 57% for most machine types (up to 70% for memory-optimized). |
Spend-Based Commitments | Commit to a spending level ($ per hour) for specific services, such as Cloud SQL or VMware Engine. Discounts up to ~25-52% depending on 1-year or 3-year. |
Sustained Use Discounts | Automatic discounts for Compute Engine when you run resources (general purpose/memory-optimized VMs) for a substantial portion of the month (25%+). Scales up to 30% max discount. |
GCP Pricing Calculator | Web tool to estimate monthly costs for a planned architecture. Helps forecast spend in advance (cloud.google.com/products/calculator/). |
Budgets & Budget Alerts | - Define a budget amount for a billing account or specific projects. - Set thresholds (50%, 90%, 100%, etc.) of your budget to trigger email alerts. - By default, emails go to Billing Admins/Users; you can configure additional recipients via Cloud Monitoring or integrate with Pub/Sub for automated responses. |
Pub/Sub Integration | Programmatic notifications when budgets exceed thresholds. Example automations: shut down expensive resources, push Slack alerts, or freeze new deployments. |
Reservations | Reserve (and guarantee availability of) certain VM resources (e.g., a certain number of cores/CPUs in a region). Pairs well with committed use discounts for consistent, predictable workloads. |
Concept | Definition / Explanation |
---|---|
Billing Export to BigQuery | Automatically export granular billing data (usage cost details, pricing) from GCP to a BigQuery dataset. |
Daily Cost Detail | Exports daily usage and costs at a detailed level. |
Pricing Data | Optionally exports GCP’s list pricing information to BigQuery. |
Use Cases | - Analyze spend in BigQuery or visualize via tools like Looker Studio (Data Studio). - Helps with cost optimization, trend analysis, and custom dashboards. |
Important Note | Billing export is not retroactive; data is collected only after you enable this feature. |
Setup Steps | 1. Create or choose a BigQuery dataset. 2. Enable billing export in GCP Console (link dataset). 3. Enable the BigQuery Data Transfer Service API. 4. Data is updated daily (for cost detail). |
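
Once the export is enabled, the data can be queried like any other BigQuery table. A sketch using the bq tool (the dataset and export-table names are placeholders; the real table name is derived from your billing account ID):

```bash
# Total cost per service for the current invoice month from the billing export.
bq query --use_legacy_sql=false '
  SELECT service.description AS service, ROUND(SUM(cost), 2) AS total_cost
  FROM `my-project.billing_export.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX`
  WHERE invoice.month = FORMAT_DATE("%Y%m", CURRENT_DATE())
  GROUP BY service
  ORDER BY total_cost DESC'
```
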
Concept | Definition / Explanation |
---|---|
Cloud APIs | The set of GCP service endpoints that let you programmatically control and integrate GCP resources (e.g. Compute Engine API, BigQuery API, etc.). |
Enable an API | You must enable each API at the project level before you can use it (via Console, gcloud CLI, or Service Usage API). |
API Library | Console section listing available GCP APIs. Allows quick enabling/disabling. |
Monitoring / Quotas | API usage can be tracked in the API Dashboard (APIs & Services → Dashboard); project-wide quotas are managed under IAM & Admin → Quotas. Quotas help prevent excessive usage. |
Automation | Accessing Cloud APIs directly allows you to script or code solutions in your preferred language. gcloud/Console also use these APIs under the hood. |
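
A quick sketch of enabling APIs from the CLI (the chosen APIs are just examples):

```bash
# List APIs already enabled in the active project, then enable two more.
gcloud services list --enabled
gcloud services enable compute.googleapis.com bigquery.googleapis.com
```
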
Concept | Definition / Explanation |
---|---|
Super Admin Account | - Exists in G Suite / Cloud Identity setups. - Has irrevocable org-level permissions (can grant Organization Admin, etc.). - Should not be used for daily tasks (principle of least privilege). |
Personal Gmail Approach | - If you do not have G Suite / Cloud Identity, you may use personal Gmail accounts as separate “admin” or “user” accounts. - Each standalone Gmail account can have distinct billing & IAM roles. |
Billing Account Admin vs. User | - Billing Account Admin: Full control over billing (budgets, payment methods, etc.). - Billing Account User: Can link/unlink projects to the billing account but cannot modify payment methods or budgets. |
Principle of Least Privilege | Assign only the minimum required roles (e.g., a second user might only need Billing Account User if they just need to attach projects to billing). |
Steps to Add | 1. In Console, go to Billing → Account Management. 2. Add the new user’s email, select the appropriate role (e.g., Billing Account User). 3. New user can log in, create projects, attach to that billing account, etc. |
Concept | Definition / Explanation |
---|---|
Cloud SDK | A set of command-line tools (primarily gcloud, gsutil, and bq) for managing GCP resources. |
gcloud CLI | Main CLI tool for GCP. Allows you to create, update, delete resources (VMs, networks, etc.), manage IAM, billing, etc. |
User vs. Service Account Auth | - User Account: Tied to an individual’s Google identity. Good for interactive use on a single machine. - Service Account: Tied to a service identity. Often used for automation (scripts, CI/CD). |
Key Commands | - gcloud init: Initialize & authorize the SDK; creates a configuration. - gcloud auth login: Authorize using user credentials. - gcloud config: Manage configurations (set account, project, zone, etc.). - gcloud components: Install/update additional CLI components. |
Command Format | gcloud [component] [entity] [operation] [arguments] [flags] (e.g., gcloud compute instances create …). |
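
A minimal first-time setup sketch using the commands above (project ID and zone are placeholders):

```bash
gcloud init                                   # authorize and create a default configuration
gcloud auth login                             # (re)authorize user credentials if needed
gcloud config set project my-example-project  # choose the active project
gcloud config set compute/zone us-central1-a  # set a default zone
gcloud config list                            # review the active configuration
```
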
Concept | Definition / Explanation |
---|---|
Multiple Configurations | You can create named “profiles” of gcloud settings (e.g., default, master) to handle different accounts/projects. Switch with gcloud config configurations activate <NAME>. |
Auth & Accounts | - gcloud auth list: List all authorized accounts and show which one is active. - gcloud auth revoke <ACCOUNT>: Remove credentials for an account. |
Set Config Properties | gcloud config set <property> <value> (e.g., gcloud config set project my-project). Applies to the currently active configuration. |
Components | - gcloud components install <component>: Install optional tools (e.g., kubectl). - gcloud components list: See available components. - gcloud components update: Update all installed components to the latest version. |
Interactive Shell (beta) | gcloud beta interactive provides inline autocompletion, hints, and command documentation. |
Info & Logs | gcloud info : Shows details about your SDK installation, project, active account, and config location. |
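
A sketch of juggling two identities with named configurations (accounts and project IDs are placeholders); note that creating a configuration also activates it:

```bash
gcloud config configurations create work
gcloud config set account admin@example.com
gcloud config set project work-project-id

gcloud config configurations create personal
gcloud config set account me@gmail.com
gcloud config set project personal-project-id

gcloud config configurations list            # see all configurations and which is active
gcloud config configurations activate work   # switch back to the work profile
```
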
Concept | Definition / Explanation |
---|---|
Cloud Shell | A browser-based, ephemeral VM with the Cloud SDK & other dev tools preinstalled (git, Docker, Kubernetes tools, etc.). Authenticated as your user automatically. |
Persistent Home Directory (5 GB) | Each user gets 5 GB of persistent storage mounted to /home/<user> . Data remains intact across sessions but sessions themselves are ephemeral. After 1 hour of inactivity, the VM is reclaimed, but files in /home persist. |
Auto-Upgrade | Cloud Shell’s SDK components are updated weekly. |
Code Editor | Integrated via Eclipse Theia. Allows browsing, editing files in your Cloud Shell environment. |
Web Preview | Lets you preview web apps running in Cloud Shell on a secure proxy (ports typically 8080 or 8081). Accessible only to your logged-in user. |
Customization | - You can auto-install extra tools by creating a .customize_environment script in your home directory. - This script runs at session startup (e.g., to install Terraform, Helm, etc.). |
Quota & Limits | - 50 hours/week usage limit. - If idle for 1 hour, the session is terminated. - If you don’t use Cloud Shell for 120 days, your home disk is deleted (with warning). |
Concept | Definition / Explanation |
---|---|
Creating a Project | - Each project is a separate namespace for resources. - Must have a billing account linked (unless using only free-tier services). |
Switching Projects | - GCP Console “Project Selector” or gcloud config set project <PROJECT_ID>. - Each project has a unique ID (automatically assigned or custom). |
Linking to Billing | - Users need appropriate billing roles to link a project to a billing account (e.g., Billing Account User + Project Owner). |
Permissions | - Projects can be shared with other Google accounts via IAM roles. - E.g., Project Editor, Project Owner, Project Viewer. |
Managing Multiple Projects | - Best practice to isolate environments, e.g., dev/test/prod in separate projects. - Each project can have distinct roles, budgets, APIs enabled, etc. |
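
A sketch of the project workflow from the CLI (project ID and billing account ID are placeholders; linking billing requires the roles noted above, and older SDK versions may expose the billing commands under gcloud beta):

```bash
# Create a project, make it the active one, and link it to a billing account.
gcloud projects create my-dev-project-12345 --name="Dev Project"
gcloud config set project my-dev-project-12345
gcloud billing accounts list
gcloud billing projects link my-dev-project-12345 --billing-account=000000-AAAAAA-BBBBBB
```
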
Concept | Definition / Explanation |
---|---|
Quotas | Resource usage limits for APIs/services at the project level (e.g., number of VMs, load balancers, requests per day). |
Types of Quotas | - Rate Quotas: e.g., requests per second, per day (resets after time). - Allocation Quotas: e.g., max number of VMs or CPU cores (must manually free up by deleting resources). |
Purpose | - Protect GCP users from accidental usage spikes. - Manage resource distribution among many customers. - Provide a limit you can request to raise if needed. |
Viewing Quotas | 1. Quotas Page (IAM & Admin → Quotas) for a full project-wide list. 2. API Dashboard (APIs & Services → select an API → Quotas) for per-API usage over time. |
Requesting Increases | - Select the quota, click “Edit Quotas,” specify new limit, and submit to Google for approval. - Approval is often within ~2 business days. |
Quota Monitoring | - Some services expose quota metrics in Cloud Monitoring (e.g., Compute Engine). - You can create custom dashboards and alerts for near-quota usage. |
Errors | Hitting a quota limit typically returns HTTP 429 (Too Many Requests) or the gRPC status RESOURCE_EXHAUSTED. |
Concept | Definition / Explanation |
---|---|
Principle of Least Privilege | Grant only the minimum necessary permissions to users/services. Avoid broad roles (e.g., Owner, Editor) in favor of more granular ones. |
IAM | “Identity and Access Management” in GCP. Manages who (member) has what (role/permission) on which resource. |
Policy | Collection of statements/bindings specifying which members get which roles (and under what conditions). Attached to a resource (organization, folder, project, or resource). |
Binding | Binds a role to one or more members, plus optional conditions. |
Metadata (ETag, Version) | - Etag: Concurrency-control token that changes each time the policy is updated. - Version: Specifies the policy schema version (1 and 3 are common); version 3 supports conditions. |
Audit Config | Specifies which permission types (Admin Read, Data Read, Data Write) get logged, and which identities are exempted. |
Member Type | Definition / Explanation |
---|---|
Google Account | A user with a Google identity (e.g., gmail.com or a managed account in your domain). |
Service Account | Special account for applications/VMs, not tied to a human. Used to authenticate workloads (e.g., GCE, GKE pods) to other GCP services. |
Google Group | A named collection of accounts/service accounts. Granting roles to the group implicitly grants them to all members. |
G Suite/Cloud Identity Domain | Represents all users under a specific domain (e.g., my-company.com). Can manage domain users centrally. |
allAuthenticatedUsers | Anyone with a Google Account/Service Account authenticated with Google. |
allUsers | Anyone on the internet (anonymous & authenticated). Highly risky—grants public access. |
Concept | Definition / Explanation |
---|---|
Permission | Action allowed on a service (e.g., compute.instances.start ). Typically follows the pattern service.resource.verb . |
Role | Named collection of permissions. You cannot grant permissions directly; you grant roles to members. |
Primitive Roles | Owner, Editor, Viewer. Very broad. Apply at project level. Google recommends avoiding them except for small cases—prefer more granular roles. |
Predefined Roles | Roles curated by Google for specific services. Provide fine-grained permissions. E.g., compute.instanceAdmin.v1 , storage.objectViewer . |
Custom Roles | User-defined roles bundling specific permissions. Not automatically updated by Google. Created at org or project level. Let you tailor exactly which permissions are included. |
Launch Stage (Custom Roles) | Each custom role has a stage: alpha, beta, or GA. Mainly for internal lifecycle tracking. |
Concept | Definition / Explanation |
---|---|
Hierarchy | Org → Folders → Projects → Resources. A resource inherits the union of policies from higher levels. |
Effective Policy | Union of the resource’s own policy + all inherited policies from ancestors. |
Policy Versions | - v1: Standard (no conditions). - v2: Internal to Google. - v3: Supports conditions. |
Condition | A logic expression restricting the role binding to specific context (e.g. time-based, IP-based). If condition is false, no access is granted. |
Time-Based Access | Example: Grant role only until a certain date/time, or only during specific hours. |
Resource-Based Access | Example: Grant roles only for certain resource name patterns or regions. |
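
A sketch of a time-bound (conditional) binding, assuming a placeholder project, user, and expiry date; conditions require policy version 3:

```bash
# Grant read access to Cloud Storage objects only until the end of 2025.
gcloud projects add-iam-policy-binding my-example-project \
    --member="user:contractor@example.com" \
    --role="roles/storage.objectViewer" \
    --condition='expression=request.time < timestamp("2025-12-31T00:00:00Z"),title=expires-end-2025'
```
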
Concept | Definition / Explanation |
---|---|
Service Account | Non-human account for apps/VMs to access GCP APIs. Identified by unique email (e.g., my-sa@my-project.iam.gserviceaccount.com ). |
Types | - User-Managed: You create & manage them. - Default: Auto-created for GCE/App Engine with the Editor role by default. - Google-Managed: For internal Google services (service agents). |
Authentication (Keys) | - Google-managed keys: Private portion never exposed, automatically rotated. - User-managed keys: You hold the private key. Must rotate & secure it yourself (risk of compromise). |
Service Account Permissions | Service accounts can be granted roles (i.e., they’re an identity). Also, controlling who can “act as” (impersonate) a service account is crucial (via the “Service Account User” role). |
Access Scopes | Legacy mechanism for granting permissions on default service accounts. The modern approach is to use IAM roles on the service account. |
Best Practices (Service Accounts) | - Use a separate service account per application component. - Avoid using default service accounts in production—create custom ones with minimal roles. - Rotate external (user-managed) keys frequently. - Keep keys out of source code. |
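
A sketch of those best practices in CLI form (project, service account, and instance names are placeholders): create a dedicated service account, grant it one narrow role, and attach it to a VM rather than exporting a key.

```bash
gcloud iam service-accounts create app-backend-sa \
    --display-name="App backend service account"

# Grant one narrowly scoped role instead of Editor.
gcloud projects add-iam-policy-binding my-example-project \
    --member="serviceAccount:app-backend-sa@my-example-project.iam.gserviceaccount.com" \
    --role="roles/cloudsql.client"

# Attach the service account at VM creation time; no key file needed.
gcloud compute instances create backend-vm --zone=us-central1-a \
    --service-account=app-backend-sa@my-example-project.iam.gserviceaccount.com \
    --scopes=cloud-platform
```
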
Concept | Definition / Explanation |
---|---|
Cloud Identity | Google’s Identity-as-a-Service solution. Centrally manages users/groups, enforces policies (2SV, password rules), SSO, device management, and more. |
Device Management | Enforce security policies on users’ mobile or desktop devices, e.g. passcodes, wiping corporate data on departure. |
Security Features | - 2-Step Verification: Mandate strong multi-factor authentication. - Password Policies: Centralized control of password complexity, rotation, etc. |
Single Sign-On (SSO) | Users log in once with corporate credentials (Cloud Identity or G Suite) to access multiple apps. Supports SAML, OAuth, OpenID, AD FS, etc. |
Reporting & Audits | Audit logs for user logins, group changes, device changes. Export to BigQuery for analysis. |
Directory Management | Sync with on-prem or external identity providers (Active Directory, LDAP) using Google Cloud Directory Sync (GCDS). |
Google Cloud Directory Sync | A tool that synchronizes user accounts, groups, and directory data from on-premises LDAP directories to Google Cloud services. |
Identity federation | A set of protocols and practices that enable an external identity provider to authenticate users, allowing access to multiple systems using a single set of credentials. |
Best Practice | Explanation |
---|---|
Use Least Privilege | Grant only necessary permissions—prefer narrower roles (e.g., predefined) over broad roles (Owner/Editor). |
Use Groups | Assign IAM roles to Google groups rather than individual users. Makes membership changes simpler without editing the policy. |
Set Policies at the Appropriate Level | E.g., if you only need to grant roles for a single project, don’t do it at the organization or folder level. |
Control Service Account Creation | Limit who can create or manage service accounts—because someone who can impersonate a high-privilege service account can access all resources that account has. |
Rotate Keys | For any user-managed service account keys, rotate them periodically to prevent compromise. |
Check Audit Logs | Monitor logs for suspicious policy changes and/or key usage. Export them to Cloud Storage or BigQuery for long-term retention. |
Minimize Default SA Usage | Don’t rely on default service accounts (often have broad Editor role). Create custom SAs with narrower permissions. |
Mirror Org Structure | Use folders/projects to match your organization’s departments/teams for logical separation and policy inheritance. |
Concept / Layer | Definition / Explanation |
---|---|
OSI Model (7 Layers) | A conceptual model for how data moves through a network: Physical → Data Link → Network (Layer 3) → Transport (Layer 4) → Session → Presentation → Application (Layer 7). |
IPv4 Addressing | 32-bit address written in dotted-decimal form (e.g., 192.168.0.1). Divided into network + host portions. RFC 1918 private ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16. |
CIDR (Classless Inter-Domain Routing) | Replaces “classful” A/B/C approach with flexible prefix notation (e.g., /16 ). The larger the slash number, the smaller the network size, e.g. /24 = 256 addresses; /16 = 65,536 addresses. |
IPv6 Addressing | 128-bit hexadecimal notation (e.g., 2001:0db8:85a3::8a2e:0370:7334 ). Can shorten zero blocks with :: . Uses /64 for many typical subnets. |
Transport (Layer 4) | - TCP (Transmission Control Protocol) ensures reliable, ordered delivery. - UDP (User Datagram Protocol) is a simpler, connectionless protocol often used for DNS or streaming. |
Application (Layer 7) | Protocols like HTTP(S), DNS, SSH, SMTP, etc. The highest layer where user-facing apps / services operate. |
Concept | Definition / Explanation |
---|---|
VPC Overview | A global, software-defined network in GCP. Spans all regions. Contains subnets (regional). Allows internal communication over private IPs within the same VPC. |
Global Resource | VPC itself is a global resource, but subnets are per-region. |
Default Network | Created automatically in new projects (unless disabled by an org policy). It’s an auto mode VPC with one predefined subnet per region (using 10.128.0.0/9 block). Includes default firewall rules for SSH, RDP, ICMP, and internal traffic. |
Auto Mode vs. Custom Mode | - Auto Mode: Automatically creates one subnet per region. Subnets are assigned from the 10.128.0.0/9 range. Can be converted to custom mode.- Custom Mode: No subnets by default; you manually create subnets & define IP ranges. Recommended for production. |
VPC Peering / VPN | Separate VPCs typically can’t communicate via internal IPs unless you set up VPC peering or a VPN / Interconnect. |
Default Firewall Rules | Default VPC includes rules allowing inbound SSH, RDP, ICMP from any source, and all protocols/ports inside the network. Modify or remove as needed for security. |
Concept | Definition / Explanation |
---|---|
Subnets | Regional partitions of a VPC network’s IP space. Contain primary (and optionally secondary) IP ranges. |
Primary vs. Secondary Range | - Primary: The main CIDR block used for VM instance IP assignments. - Secondary: (Optional) Additional CIDR blocks for scenarios like container alias IPs, etc. |
Subnet Expansion | You can expand a subnet’s IP range without downtime, as long as it doesn’t overlap with existing subnets. Once expanded, it cannot be reverted to a smaller range. |
Auto Mode Subnets | Created automatically for each region with default CIDR blocks. Each region’s block can be expanded (up to /16), or you can convert the entire VPC to Custom Mode for more control. |
Reserved IPs | Each subnet’s primary range reserves 4 IP addresses (network, default gateway, future use, broadcast). Secondary ranges do not have reserved IPs. |
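
A sketch of a custom-mode VPC and subnet (names and CIDR ranges are placeholders), including a secondary range, Private Google Access, and a later range expansion:

```bash
gcloud compute networks create my-custom-vpc --subnet-mode=custom

gcloud compute networks subnets create app-subnet \
    --network=my-custom-vpc --region=us-central1 \
    --range=10.0.1.0/24 \
    --secondary-range=pods=10.4.0.0/20 \
    --enable-private-ip-google-access

# Expand the primary range later if needed (it cannot be shrunk back afterwards).
gcloud compute networks subnets expand-ip-range app-subnet \
    --region=us-central1 --prefix-length=23
```
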
Concept | Definition / Explanation |
---|---|
Routes | Define how traffic exits a VM to reach a destination (either inside or outside the VPC). Each route has a destination range + next hop. |
System-Generated Routes | - Default Route: 0.0.0.0/0 → Default Internet Gateway. Priority 1000. You can remove/replace it if you want full isolation. - Subnet Routes: One route per subnet’s primary and secondary range. Priority 0, more specific than the default route. Cannot be removed separately. |
Custom Routes | - Static: Manually created or set up with policy-based VPN. - Dynamic: Managed by Cloud Router (using BGP for Cloud VPN/Interconnect). |
Routing Priority | Lower number = higher priority. For identical destination ranges, the route with the smallest priority value wins. |
Private Google Access | VM instances without external IPs can still access Google APIs/services by enabling this on their subnet. Traffic to Google stays on Google’s backbone rather than going out to the public internet. |
Use Cases for PGA | - Subnet without external IP addresses. - On-prem to GCP via VPN/Interconnect. - GCP serverless or VPC peering (private services access). |
Concept | Definition / Explanation |
---|---|
Internal vs. External IP | - Internal IP: Reachable only within the same VPC (private). - External IP: Reachable from the public internet (if firewall allows). |
Ephemeral vs. Static | - Ephemeral: Auto-assigned, released when resource is stopped/deleted. - Static: Reserved and remains allocated to your project until released. |
Internal IP Allocation | - Automatically assigned from subnet’s IP range. - You can specify an address or reserve one. - Alias IP ranges let you define multiple IPs on a VM (e.g., container pods). |
External IP Allocation | - Ephemeral assigned if you launch a VM with external access (and you don’t specify a static one). - Can reserve a static external IP (regional or global). Regional → used by VMs / LBs in that region. Global → used by global LBs. |
Promotion (Ephemeral → Static) | You can take an ephemeral IP (internal or external) in use by a resource and promote it to static so it won’t change. |
Bringing Your Own IP (BYOIP) | You can import your own publicly routable IP prefixes (min /24 ) to GCP. Must prove ownership. |
Action | Definition / Explanation |
---|---|
Reserve a Static Internal IP | 1. On VM creation (Console → Networking → Reserve static internal IP). 2. Or create VM with ephemeral IP, then promote it to static. 3. Must be within the subnet’s CIDR range. |
Reserve a Static External IP | 1. Go to VPC Network → External IP addresses → Reserve static address. 2. Assign it to a VM or load balancer. 3. (Optional) Promote ephemeral external IP to static. |
Promote Ephemeral → Static | Convert a currently-used IP to a persistent assignment (for internal or external addresses). Prevents IP changes on VM stop/start. |
Deleting / Releasing | Remember to remove static IPs when no longer needed; otherwise, they incur charges even if unattached. - For internal IPs, use gcloud or re-assign in the networking settings. - For external IPs, “Release static address” in the console or via gcloud. |
gcloud compute addresses | - list: View addresses in your project (internal & external). - create: Reserve a static IP. - delete: Release it. |
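
The same actions as gcloud commands (address name and region are placeholders):

```bash
# Reserve, inspect, and release a regional static external IP.
gcloud compute addresses create web-frontend-ip --region=us-central1
gcloud compute addresses list
gcloud compute addresses describe web-frontend-ip --region=us-central1
gcloud compute addresses delete web-frontend-ip --region=us-central1   # release when no longer needed
```
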
Concept | Definition / Explanation |
---|---|
Distributed Firewall | Each VPC has a distributed, stateful firewall at the VM level. Rules apply inbound (ingress) or outbound (egress). |
Implied Rules | - Allow Egress: All outbound traffic is permitted unless blocked by a higher-priority rule. - Deny Ingress: All inbound traffic is denied unless allowed by a firewall rule. |
Default Rules | In the default VPC, rules allow ICMP, RDP(3389), SSH(22) inbound from any source and all protocols within the network. Priority 65534. |
Firewall Rule Components | - Direction: Ingress or Egress. - Action: Allow or Deny. - Targets: Which VMs (all, by tags, by service account). - Source/Dest: IP ranges, tags, or service accounts. - Protocols/Ports: e.g. tcp:22, icmp, etc. - Priority. |
Stateful | Once a connection is allowed, the return traffic is automatically allowed (connection tracking). |
Enable / Disable | You can disable a rule without removing it (handy for troubleshooting). |
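
A sketch of two typical rules for a setup like the custom-VPC lab described next (network name, tag, and ranges are placeholders):

```bash
# Allow SSH from anywhere, but only to instances tagged "public".
gcloud compute firewall-rules create allow-ssh-public \
    --network=my-custom-vpc --direction=INGRESS --action=ALLOW \
    --rules=tcp:22 --source-ranges=0.0.0.0/0 --target-tags=public

# Allow all internal traffic between instances on the network's private ranges.
gcloud compute firewall-rules create allow-internal \
    --network=my-custom-vpc --direction=INGRESS --action=ALLOW \
    --rules=tcp,udp,icmp --source-ranges=10.0.0.0/8
```
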
Action | Definition / Explanation |
---|---|
Create Custom VPC | 1. VPC Network → Create VPC. 2. Choose Custom subnet mode (no automatic subnets). 3. Manually add subnets, specifying region + CIDR. |
Add Public / Private Subnets | E.g., “public” subnet with external IP usage, “private” subnet with no external IP addresses. |
Enable Private Google Access | Allows VMs with no external IP to still reach Google APIs/services (Cloud Storage, etc.) over internal IP. - Turned on at the subnet level. |
Create Instances | - Public instance: ephemeral or static external IP, can reach internet. - Private instance: no external IP. Must rely on private Google Access or direct connection from the public instance to reach outside resources. |
Firewall Rules | - E.g., allow SSH from 0.0.0.0/0 to public instances, allow internal traffic from public→private. - Use target tags (e.g., “public” / “private”) to limit scope. |
Verification | - SSH into public instance from internet. - From public → private instance (SSH or ping). - Check Cloud Storage access from private instance via private Google Access (no external IP). |
Concept | Definition / Explanation |
---|---|
VPC Peering | Privately connect two VPC networks (in the same or different projects/orgs) so their internal IPs can talk without traversing the public internet. |
Supported | - All subnet routes are exchanged. - Optionally, custom static routes can be imported/exported. - Reduces egress costs, latency, and improves security. |
Restrictions | - No transitive peering (A↔B, B↔C doesn’t imply A↔C). - Subnet IP ranges must not overlap. - Each side must configure the peering, must be “active” on both sides. |
Separate Admins | Each VPC is managed independently (its own firewall rules, routes, etc.). Peering simply provides private connectivity. |
Demo Steps | 1. Create two custom VPC networks (e.g., NetA, NetB) in separate projects. 2. Create VMs in each (firewall rules to allow SSH/ICMP). 3. Under VPC Peering, create connection from NetA to NetB, then from NetB to NetA. 4. Test connectivity by pinging internal IPs. |
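
A sketch of the peering demo in CLI form (project and network names are placeholders; each side must create its half of the peering before it becomes active):

```bash
# From project-a: peer net-a with net-b in project-b.
gcloud compute networks peerings create neta-to-netb \
    --network=net-a --peer-project=project-b --peer-network=net-b

# From project-b (or with --project): create the matching half.
gcloud compute networks peerings create netb-to-neta \
    --network=net-b --peer-project=project-a --peer-network=net-a \
    --project=project-b

gcloud compute networks peerings list --network=net-a   # confirm the peering is ACTIVE
```
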
Concept | Definition / Explanation |
---|---|
Shared VPC | Lets multiple projects share a common VPC network in a “host project.” Service projects attach to the host’s shared VPC. Instances in service projects get IP addresses from the host’s shared subnets. |
Host Project | Contains the shared VPC network (one or more). Must belong to an organization. Administrators of the host project can grant subnets to service projects. |
Service Project | Project “attached” to a host project’s shared VPC. VMs created in the service project can use subnets from the host’s shared network. |
Roles | - Shared VPC Admin: Can enable host projects, attach service projects, and delegate subnet usage. - Service Project Admin: Manage resources in the service project. May have project-level or subnet-level usage permissions on the host project. |
Use Cases | 1. Simple Shared VPC: Single host project with multiple service projects. 2. Multiple Host Projects: e.g., dev vs. prod. 3. Hybrid: On-prem + host project with shared subnets. 4. Multi-tier: Different service projects for web vs. back-end tiers. |
Standalone Project | Neither a host nor a service project. Uses its own VPC as normal. |
Concept | Definition / Explanation |
---|---|
VPC Flow Logs | Captures a sample of network flows to and from VM instances (including GKE nodes). Used for real-time visibility into traffic, forensics, capacity planning, cost optimization, etc. |
Enable on Subnet | Flow logs are enabled on a per-subnet basis. All VMs in that subnet then produce flow logs in real time. |
Sampling Rate | Approximately 1 out of every 10 packets is captured. This primary sampling rate is set by Google Cloud and cannot be changed, though you can configure a secondary sampling rate (0.0–1.0) per subnet to reduce how many of the collected flows are actually logged. |
Data Export | - Cloud Logging for 30 days (by default). - Can export logs to Cloud Storage for longer retention or to BigQuery for analysis. |
Use Cases | - Network Monitoring (throughput, performance). - Real-Time Security (send logs to SIEM systems, detect anomalies). - Forensics (trace suspicious IP traffic). - Cost / Capacity (see traffic flows, optimize egress). |
Log Format | - Base fields (always included) plus optional metadata fields (e.g., GKE annotations). - Can filter logs to only store what you need. - Viewed in Cloud Logging (classic/preview logs viewer). |
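
A sketch of enabling flow logs on an existing subnet (subnet name and region are placeholders; the sampling and aggregation flags are optional):

```bash
gcloud compute networks subnets update app-subnet \
    --region=us-central1 \
    --enable-flow-logs \
    --logging-flow-sampling=0.5 \
    --logging-aggregation-interval=interval-5-sec
```
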
Concept | Definition / Explanation |
---|---|
Domain Name System (DNS) | A global, hierarchical, distributed database for translating human-friendly domain names (e.g., google.com) into IP addresses (e.g., 172.217.x.x). |
Root (.) | The top of the DNS hierarchy—13 root server addresses (each heavily replicated via anycast) that refer resolvers to the TLD name servers (like .com, .net). |
Top-Level Domain (TLD) | E.g., .com, .org, .net (generic TLD), or .uk, .ca (country-code TLD). TLD name servers point to authoritative name servers for your domain. |
Second-Level Domains | Your registered domain (e.g., bowtieinc.co). Often purchased via a domain registrar. Typically delegates to subdomains (e.g., dev.bowtieinc.co). |
DNS Resolver | A server (often ISP-provided) that recursively queries DNS on your behalf. Caches results based on TTL to speed up subsequent requests. |
Zone File | Contains DNS records (A, AAAA, MX, CNAME, etc.) for a domain (a zone). Hosted by an authoritative name server. |
Caching & TTL | Resolvers store DNS records in memory for a “time to live” period. Low TTL = more frequent updates but more queries. High TTL = fewer queries, but changes propagate slower. |
Lookup Steps | 1. Client queries DNS resolver. 2. Resolver contacts root name server, then TLD server, then authoritative server. 3. Authoritative server returns IP address. 4. Resolver caches result, returns to client. |
Record Type | Definition / Explanation |
---|---|
NS (Name Server) | Specifies the authoritative name servers for a domain. E.g., the domain’s DNS is served by ns1.example.com , ns2.example.com . |
A / AAAA | - A: Maps a domain to an IPv4 address. - AAAA: Maps a domain to an IPv6 address. |
CNAME | Canonical name record. Points one domain name to another. E.g., ftp.bowtieinc.co → bowtieinc.co . |
TXT | Holds arbitrary text data, often used for domain ownership verification (e.g., Google Workspace), SPF/DKIM records for email security, or other meta info. |
MX (Mail eXchange) | Specifies mail server(s) for handling email for a domain. Includes a priority value (lower = higher priority). E.g., bowtieinc.co. IN MX 10 mail.bowtieinc.co. |
PTR (Pointer) | Reverse DNS record. Maps an IP address back to a domain name. Stored in special .in-addr.arpa (IPv4) or .ip6.arpa (IPv6) zones. Often used for logging, spam checks. |
SOA (Start of Authority) | Holds zone-level data, e.g., admin email, serial number, and refresh intervals. Exactly one per zone. Ensures correctness and zone authority. |
Concept | Definition / Explanation |
---|---|
NAT | Translates private (RFC 1918) IP addresses to a public IP (or pool of public IPs) to enable internet-bound traffic. Also can hide real IP addresses for security. |
Static NAT (1-to-1) | A private IP is permanently mapped to a single public IP. Outbound & inbound traffic can occur using that mapped public IP. Often used if a device must be reachable externally on a stable IP. |
Dynamic NAT (Many-to-Few) | A pool of public IPs is shared among private addresses. IPs from the pool are allocated on demand. Released back into the pool after usage. |
PAT (Port Address Translation) (Many-to-1) | Multiple private IPs share a single public IP. NAT device uses unique source ports to track connections. E.g., typical home router scenario, also used by GCP’s Cloud NAT. |
Use Cases | - Dealing with limited public IP addresses. - Securing private networks from direct internet exposure. - Common home/office router scenario. |
Concept | Definition / Explanation |
---|---|
Cloud DNS | Google’s managed authoritative DNS service. Fully distributed, high availability (globally). Manages DNS zones & records for domains. |
Public vs. Private Zones | - Public: DNS data is visible over the internet. Typically used for external domain hosting. - Private: DNS data accessible only from within your VPC network(s). |
Managed Zones | A “DNS zone” hosted by Google’s DNS name servers. You create records (A, CNAME, MX, etc.) for your domain. - Public zone usage requires domain purchase from a registrar (not provided by Cloud DNS). |
Authoritative Name Servers | Cloud DNS automatically allocates name servers for your zone. You update your domain’s NS records at your registrar to point to these. |
Records & Record Sets | Within a zone, you define resource record sets (e.g., www → A record). An “SOA” and “NS” record are created by default. |
Usage | - Host a public DNS domain (point domain registrar’s NS to Cloud DNS). - Host a private DNS zone for internal name resolution (works only within your VPC or with DNS peering). |
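
A sketch of hosting a public zone (domain, zone name, and IP are placeholders); the last command prints the name servers to configure at your registrar:

```bash
gcloud dns managed-zones create example-zone \
    --dns-name="example.com." --description="Public zone for example.com"

gcloud dns record-sets create www.example.com. \
    --zone=example-zone --type=A --ttl=300 --rrdatas=203.0.113.10

gcloud dns managed-zones describe example-zone --format="value(nameServers)"
```
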
Concept | Definition |
---|---|
Private Google Access | Enables VM instances without external IPs to access Google APIs and services through internal connectivity. |
Private Service Connect | Provides internal endpoints in a VPC network to privately connect to managed services using internal IP addresses. |
VPC Service Controls | Enhances security by establishing a security perimeter around Google Cloud resources to mitigate data exfiltration risks. |
Serverless VPC Access | Allows serverless environments to securely connect to VPC networks using internal IP addresses without requiring a public IP connection. |
Concept | Definition / Explanation |
---|---|
Bare Metal Model | One OS running directly on the server hardware. Typically not flexible, underutilizes resources. |
Hypervisor | A software layer (also called a VMM—Virtual Machine Monitor) that enables multiple OSes (VMs) to share and manage the same host hardware. |
Full Virtualization | Emulates all hardware in software. Early approaches used binary translation, which was slow. |
Para-Virtualization | Modified guest OS communicates directly with hypervisor (no full emulation). Improves performance, but requires guest OS changes. |
Hardware-Assisted Virtualization | Modern CPUs have virtualization extensions (Intel VT-x, AMD-V). Hypervisor leverages these to run unmodified OSes efficiently. Reduces overhead (no heavy binary translation). |
Kernel-Level Virtualization | The hypervisor is part of the OS kernel itself (for example, KVM on Linux). VMs are treated like user-space processes. This approach powers GCP’s Compute Engine (with nested virtualization support). |
Nested Virtualization | Running a hypervisor (and VMs) inside another VM. Google’s kernel-level virtualization approach supports this. Useful for migrating on-prem VM images without big changes. |
Benefits | - Better Resource Utilization (multiple OSes on same hardware). - Isolation (one VM crash doesn’t affect others). - Flexibility (spin up VMs on demand). |
Concept | Definition / Explanation |
---|---|
Compute Engine | Google Cloud’s IaaS offering for running VMs (“instances”). Google manages the underlying hardware, data centers, networking, etc. |
VPC Integration | Instances live in a VPC subnet. Must choose region/zone upon creation, attach disk, and configure networking. |
Pricing | Pay per second (minimum 1 minute). Sustained use discounts or Committed use discounts can reduce cost. |
Core Configuration | 1. Machine Type (vCPU + memory). 2. OS Image (public, custom, marketplace). 3. Disk Type (standard, SSD, local SSD). 4. Network (VPC, subnets, firewall rules). |
Multitenant vs. Sole-Tenant | - Multitenant: Default. The physical host is shared with other customers. - Sole Tenant: Dedicated physical host (for compliance or performance reasons) at higher cost. |
Action | Definition / Explanation |
---|---|
Name & Labels | - Name: Unique within the project. - Labels: Key-value pairs to help organize resources (e.g. env=dev). |
Region & Zone | Choose a region, then a zone for your VM. The zone is fixed at creation; moving an instance later means recreating it (e.g., from snapshots) or using a move operation. |
Machine Configuration | - Choose from pre-defined (general purpose, compute/mem optimized) or custom machine types. - Possibly add GPUs (for n1 type). |
Boot Disk | - Select OS image from “public images,” custom images, or marketplace solutions. - Choose disk type (standard, balanced, SSD) and size. |
Management / Security / Networking | - Management: Add startup scripts, availability policies, etc. - Security: Shielded VM, OS Login, disabling project-wide SSH keys. - Networking: Subnet, external IP, network tags, etc. |
SSH / RDP | If Linux: typically SSH on port 22. If Windows: RDP on port 3389. Must have firewall rules to allow inbound traffic. |
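
A sketch of creating and connecting to a small Linux VM with the options above (all names are placeholders):

```bash
gcloud compute instances create web-server-1 \
    --zone=us-central1-a \
    --machine-type=e2-small \
    --image-family=debian-12 --image-project=debian-cloud \
    --tags=public --labels=env=dev

gcloud compute ssh web-server-1 --zone=us-central1-a   # connect once it is running
```
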
Machine Family | Definition / Explanation |
---|---|
General Purpose | Balanced CPU/memory. Good for a wide variety of workloads like web apps, small/medium DBs, dev/test, etc. Families: E2, N1, N2, N2D. |
Compute Optimized (C2) | Highest performance per core, ideal for compute-intensive tasks (HPC, gaming, single-threaded workloads). Only available as predefined machine types. |
Memory Optimized (M1/M2) | Ultra-high memory for large in-memory DBs (SAP HANA) or analytics. Up to 12 TB RAM. Only available as predefined. |
Predefined Types | Google provides a set of standard shapes (e.g., n2-standard-4). Also have high-memory or high-CPU variants. |
Custom Machine Types | Define your own vCPUs & memory (within limits). Available for general-purpose families (e2, n1, n2, n2d). Perfect if pre-defined types don’t match your ratio needs. Slightly higher cost than an equivalent pre-defined type. |
Shared-Core Types | F1-micro, G1-small (N1) or e2-micro/small/medium. These use fractional CPU allocation & can burst CPU usage occasionally. Low-cost, best for small workloads, dev/test, or rarely used services. |
GPUs | Attach NVIDIA GPUs (e.g., Tesla K80, P100, etc.) only on n1 machine types. Used for ML training, HPC, or 3D rendering. |
Concept | Definition / Explanation |
---|---|
Instance Lifecycle | - Provisioning → Staging → Running → (stop/suspend/terminate). - Paying for CPU & memory only when in Running or Repair states. You still pay for attached disks/IPs even if suspended/stopped. |
Stopping / Suspending | - Stop: Shuts down the OS. Then transitions to Terminated. Does not incur CPU cost, but you pay for static IP & disks. - Suspend: Similar to “close laptop lid.” VM state & memory is preserved, but you still pay for disk & IP. |
Live Migration | GCP can move your running VM to another host during maintenance without reboot. You can also do manual cross-zone moves within a region. |
Shielded VMs | Ensures verifiable integrity of the VM’s boot sequence. Components: secure boot, vTPM, measured boot, integrity monitoring. Prevents low-level rootkits or boot malware. |
Guest Environment | Scripts & daemons installed in the OS that handle instance setup (metadata, ssh key injection, etc.). Public images come with it by default. For custom images, you may need to install it yourself. |
Metadata & Startup Scripts | - Metadata: Key-value pairs accessible via http://metadata.google.internal. - Startup/Shutdown Scripts: Scripts set in metadata or instance config to run on VM boot / shutdown. |
OS Login | An alternative to managing SSH keys in instance/project metadata. Ties SSH access to IAM roles. Allows 2FA for SSH. |
Windows Login | Use “Set Windows Password” to generate credentials. Connect via RDP on port 3389. Alternatively, use OS Login for Windows if you want. |
Action | Definition / Explanation |
---|---|
Linux SSH | - Via Console: “SSH in Browser.” - Via Cloud Shell or local gcloud: gcloud compute ssh instance-name. - OS Login recommended for user management. |
Windows RDP | - Enable an RDP inbound firewall rule on port 3389. - Use “Set Windows Password” in the console or gcloud. - Then RDP with IP, username, password. |
SSH Key Management | - If not using OS Login, store public keys in instance or project metadata. - Possibly block project-wide SSH keys if you want instance-level only. |
Powershell Remoting (WinRM) | - If using remote PowerShell on Windows, open port 5986. - Provide credentials. |
Browser-based | - “Open in browser window” for quick SSH. - For Windows, a “Chrome RDP” extension or .rdp file. |
Concept | Definition / Explanation |
---|---|
Instance & Project Metadata | - Metadata is stored as key-value pairs, accessible within GCP via http://metadata.google.internal/. - There are default metadata (e.g., instance name, zone) + custom metadata (user-defined) at project or instance level. - Default metadata is always present; custom metadata can be set in the console, CLI, or API. |
Startup & Shutdown Scripts | - Startup Scripts run on VM boot (e.g., install packages, configure software). - Shutdown Scripts run on VM shutdown (e.g., cleanup tasks, exporting logs). - Stored in metadata (key: `startup-script` or `shutdown-script`), or in a file that references a Cloud Storage URL. |
Use Cases | - Dynamic config: e.g., pass parameters to a startup script using metadata. - Automated installs & updates. - Automatic data exports on shutdown. |
Metadata Queries | - Use `curl` or `wget` with the special header `Metadata-Flavor: Google`. - Endpoints: `/computeMetadata/v1/instance/...` or `/computeMetadata/v1/project/...`. - Common queries: instance name, zone, custom metadata (under `/attributes`). |
Block Project-Wide SSH Keys | - Instance metadata can override project-wide keys. - Checking “Block project-wide SSH keys” means only keys in that instance’s metadata or OS Login apply. |
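A minimal sketch of querying the metadata server from inside a VM and attaching a startup script at creation time (VM name, zone, key names, and the script file are placeholders):

```bash
# Inside the VM: the Metadata-Flavor header is required.
curl -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/name"
curl -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/attributes/my-key"

# Attach a startup script (stored under the startup-script key) plus custom metadata.
gcloud compute instances create my-vm \
  --zone=us-central1-a \
  --metadata-from-file=startup-script=startup.sh \
  --metadata=my-key=my-value
```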
Concept | Definition / Explanation |
---|---|
Resource-Based Billing | - vCPU, memory, disk, etc. are each billed individually. - On-demand usage billed per second (1-minute minimum). |
Instance Uptime | - Billed while instance is running (or in “repair” state). - Stopped/suspended instances do not incur vCPU/memory costs, but disks and static IPs still accrue charges. |
Reservations | - You can reserve VM resources in a zone for future use. - Pay on-demand rates while reserved. - Ensures capacity is always available to you. - Still eligible for sustained/committed use discounts. |
Sustained Use Discounts | - Automatic discounts for running certain VMs for a significant fraction of the month. - Up to 30% off for N1 (vCPU + memory), 20% for N2/N2D/C2. - Combine usage across same region & VM type for bigger discount. |
Committed Use Discounts | - 1-year or 3-year commitment for vCPU/memory/gpu. - Up to 57% (or 70% for memory-optimized) discount. - Pay monthly whether you use it or not. - If usage > commitment, extra is at on-demand rate. - E2, N1, N2, N2D, C2 are supported. |
Preemptible VMs | - Up to 80% cheaper than on-demand. - Compute Engine can shut down (preempt) your VM at any time, and always within 24 hours. - Ideal for batch or fault-tolerant workloads. - No SLA, no live migrate, no automatic restart. |
Spot VMs | - Cost-effective alternative to on-demand VMs, typically 60–91% cheaper. - Compute Engine can reclaim Spot VMs at any time when capacity is needed for other workloads. - No 24-hour limit (unlike Preemptible VMs), so they can run longer if capacity remains available. - Ideal for batch processing, machine learning training, CI/CD jobs, and fault-tolerant workloads. - No SLA, no live migration, no automatic restart, but can be combined with managed instance groups for resiliency. |
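An illustrative comparison of the two provisioning options (names, zone, and machine type are placeholders; flags reflect current gcloud behavior):

```bash
# Spot VM: reclaimable at any time, no 24-hour cap.
gcloud compute instances create batch-worker \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --provisioning-model=SPOT \
  --instance-termination-action=STOP

# Legacy preemptible VM: always terminated within 24 hours.
gcloud compute instances create batch-worker-legacy \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --preemptible
```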
Concept | Definition / Explanation |
---|---|
Block Storage | - Data split into evenly sized blocks, each with unique ID. - Presented to OS as raw volume/hard drive. - Fastest type, often used as boot volumes (e.g., persistent disks, local SSD on Compute Engine). |
File Storage | - Data structured in hierarchical directories (files/folders). - Already structured, usually network-attached (e.g., NFS). - In GCP, provided via Filestore service (not bootable, purely shared file storage). |
Object Storage | - Data stored as “objects” with metadata + unique ID. - Infinitely scalable, often used for unstructured data (images, logs). - In GCP, Cloud Storage is object storage (flat namespace, not directly bootable, but can be FUSE-mounted). |
Performance Terms | - I/O (IO): Single read/write request. - Queue Depth: # of IOs pending. - IOPS: IO operations per second. - Throughput: data transfer speed (MB/s). - Latency: time for each IO to complete (ms). - Sequential vs. Random: large sequential vs. scattered small ops. |
Concept | Definition / Explanation |
---|---|
PD Types | 1. Standard (pd-standard): Backed by HDD, cheapest, best for sequential IO (large reads/writes). 2. Balanced (pd-balanced): Mid-tier cost/performance, good general-purpose option. 3. SSD (pd-ssd): Fastest PD type with low latency, high IOPS, higher cost. |
Zonal vs. Regional | - Zonal: Resides in a single zone. - Regional: Synchronously replicated across two zones in same region for higher availability. - Regional is slower & more expensive but more fault-tolerant. |
PD Characteristics | - Max 64 TB per disk. - Disks are network-attached (not physically attached). - Resizable online (bigger only). - Encrypted at rest (can use default or customer-managed keys). - Independent lifecycle from the VM (attach/detach, keep disk after VM delete). |
PD Performance | - Performance scales with disk size & vCPU count. - Must have enough vCPUs to drive desired IOPS. - For standard/balanced/SSD PD: the bigger the disk, the higher the IOPS/throughput. |
Snapshot | - Incremental backups at block level. - Typically used for zonal PD to replicate data or keep backups. - Snapshots can be stored in multi-regions or single region. |
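A short sketch of creating a zonal SSD PD and a regional balanced PD (names, sizes, zones, and region are placeholders):

```bash
# Zonal SSD persistent disk.
gcloud compute disks create fast-data-disk \
  --zone=us-central1-a \
  --type=pd-ssd \
  --size=200GB

# Regional balanced disk, synchronously replicated across two zones.
gcloud compute disks create ha-data-disk \
  --region=us-central1 \
  --replica-zones=us-central1-a,us-central1-b \
  --type=pd-balanced \
  --size=200GB
```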
Concept | Definition / Explanation |
---|---|
Local SSD | - Physically attached to the host server. - Highest IOPS & lowest latency. - Limited to 24 x 375 GB partitions = max 9 TB. |
Volatile Data | - Data is lost when VM is stopped, deleted, or moved. - Good for caches, scratch data, or ephemeral workloads. |
NVMe vs. SCSI | - SCSI is older, single queue. - NVMe (non-volatile memory express) is newer, supports many queues & commands, typically offers higher IOPS/throughput. |
Availability | - Only for N1, N2, and compute-optimized VM families. - Not attachable/detachable. Must be chosen at instance creation. |
Performance | - Very high read/write ops (millions of IOPS). - Lower latency than PD. |
Action | Definition / Explanation |
---|---|
Create a Persistent Disk | - Zonal or Regional. - Blank or from image/snapshot. - Choose type (standard, balanced, SSD) & size. |
Attach/Detach | - Attach disk to a running VM or a stopped VM (except local SSD). - On Linux, must format + mount. - On Windows, must initialize in Disk Management. |
Resizing a Persistent Disk | - Disks can be expanded without downtime. - Must resize the filesystem inside the OS (e.g., `resize2fs` on Linux). |
Mounting & FSTAB | - After formatting, create mount point & add entry to `/etc/fstab` (Linux) for auto-mount on reboot. |
Data Persistence | - PD remains intact even if VM is deleted (unless “delete disk” is selected). - Local SSD data is lost on VM stop/delete. |
Deleting Disks | - Must detach first from a running VM (or delete VM if boot disk). - Freed resources stop incurring cost. |
Snapshot | - Create from a disk for backup or migration. - Snapshots are incremental, can restore to new disk or instance. |
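A hedged end-to-end sketch of attaching, formatting, mounting, and growing a data disk on a Linux VM (disk/VM names, zone, and mount point are placeholders):

```bash
# Attach the disk to a running VM with a predictable device name.
gcloud compute instances attach-disk my-vm \
  --disk=fast-data-disk --device-name=fast-data-disk --zone=us-central1-a

# Inside the VM: format, mount, and persist the mount across reboots.
sudo mkfs.ext4 -m 0 -F /dev/disk/by-id/google-fast-data-disk
sudo mkdir -p /mnt/disks/data
sudo mount -o discard,defaults /dev/disk/by-id/google-fast-data-disk /mnt/disks/data
echo '/dev/disk/by-id/google-fast-data-disk /mnt/disks/data ext4 discard,defaults,nofail 0 2' \
  | sudo tee -a /etc/fstab

# Grow the disk online, then grow the filesystem to match.
gcloud compute disks resize fast-data-disk --size=400GB --zone=us-central1-a
sudo resize2fs /dev/disk/by-id/google-fast-data-disk
```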
Concept | Definition / Explanation |
---|---|
Snapshot Fundamentals | - Snapshots are incremental, point-in-time backups of persistent disks (zonal or regional). - Snapshots can be taken from running or stopped instances (disks do not have to be detached). - They are global resources: can create new disks in any region from a snapshot. |
Location & Storage | - Stored in Cloud Storage. Choose multi-regional (higher availability, higher cost) or regional (lower cost, but limited to a single region). - If snapshot region = disk region, no network charge for snapshot/restore in that region. |
Incremental & Compression | - The first snapshot is a full snapshot of the disk. - Subsequent snapshots only store changed or new blocks since the last successful snapshot. - Snapshots are compressed automatically. |
Deleting Snapshots | - Deleting a snapshot does not necessarily remove all of its data if other snapshots depend on it. - Compute Engine manages the references among snapshots automatically, moving any blocks still needed to the next snapshot in the chain. |
Frequency & Best Practices | - Minimum 10 minutes between snapshots of the same disk. - Regular snapshots reduce data-loss risk. - Off-peak snapshot times = faster & cheaper if data changes are fewer. - For Windows: Volume Shadow Copy Service (VSS) can be used for consistent snapshots. |
Concept | Definition / Explanation |
---|---|
Manual Snapshots | - One-off snapshots can be created from the console or CLI. - Must specify the source disk, snapshot name, and region or multi-region storage location. - Snapshots are incremental, so repeated snapshots are quick & cost less. |
Snapshot Schedules | - Automate periodic snapshots of a given disk. - One schedule per disk; must be in the same region as the disk. - Optionally define snapshot retention (e.g., “keep 14 days”), source disk deletion rule. - Attachable or detachable from a disk. |
Manage Snapshots & Schedules | - You can detach a schedule or delete it after detaching from all disks. - Schedules cannot be edited. Instead, remove and re-create with different settings. - Snapshots remain until manually deleted or retention policy cleans them up. |
Creating a Disk from Snapshot | - Create new disk in any region from an existing snapshot. - The new disk can then be attached to a VM as a data disk or a boot disk (if snapshot is from a bootable disk). |
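A minimal sketch covering a one-off snapshot, a restore, and an attached snapshot schedule (names, zones, and retention values are placeholders):

```bash
# One-off snapshot of a zonal disk.
gcloud compute disks snapshot fast-data-disk \
  --zone=us-central1-a \
  --snapshot-names=fast-data-disk-snap-001

# New disk (in any zone) restored from that snapshot.
gcloud compute disks create restored-disk \
  --zone=europe-west1-b \
  --source-snapshot=fast-data-disk-snap-001

# Daily snapshot schedule kept for 14 days, attached to the disk.
gcloud compute resource-policies create snapshot-schedule daily-backup \
  --region=us-central1 \
  --daily-schedule \
  --start-time=03:00 \
  --max-retention-days=14
gcloud compute disks add-resource-policies fast-data-disk \
  --zone=us-central1-a \
  --resource-policies=daily-backup
```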
Concept | Definition / Explanation |
---|---|
Deployment Manager | - GCP’s infrastructure-as-code tool for automating resource provisioning. - Uses YAML for configurations + optional Jinja or Python templates. - Deploy, update, and delete resources in a single, repeatable workflow. |
Key Components | - Configuration: The main YAML file describing resources. - Template(s): Reusable building blocks (Jinja/Python). - Deployment: A collection of resources managed together. |
Resource Types | - Base types: e.g., compute.v1.instance . - Composite types: e.g., gcp-types/compute-v1:instances (bundled sets of resources). |
Properties & References | - Properties: Parameters for the resource. Must match the specific API’s fields (e.g., machineType, network). - References: Let one resource read values from another resource (e.g., $(ref.resourceName.selfLink) ). |
Manifests | - Read-only descriptor for each deployment. Auto-created when you deploy. - Summarizes expanded config + resources (similar to a “compiled” version). |
Concept | Definition / Explanation |
---|---|
Workflow | 1. Write config (`.yaml`) + optional templates (`.jinja`/`.py`). 2. Preview (`--preview`) or deploy (`gcloud deployment-manager deployments create ...`). 3. Update (`gcloud deployment-manager deployments update ...`) or delete resources. |
Preview Mode | - Doesn’t provision any resources. - Helps catch errors in your config or templates before real deployment. |
Templates | - Split configurations into smaller re-usable .jinja or .py files. - Use environment variables + custom properties to handle dynamic values. |
References | - Use $(ref.myResource.property) to refer to output from one resource as an input to another. - Ensures correct order of creation (dependency). |
Best Practices | - Keep separate config for major categories (e.g., network vs. compute vs. security). - Always preview changes. - Use version control (Git) + automation (CICD). - Use references to handle resource dependencies. - Automate project creation if needed. |
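A minimal single-VM configuration to illustrate the workflow; the project ID, zone, image family, and resource names are placeholders, and the resource URLs may need adjusting for your project:

```bash
cat > vm.yaml <<'EOF'
resources:
- name: dm-example-vm
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType: https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/zones/us-central1-a/machineTypes/e2-medium
    disks:
    - boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-12
    networkInterfaces:
    - network: https://www.googleapis.com/compute/v1/projects/[PROJECT_ID]/global/networks/default
EOF

# Preview first, then execute the previewed changes, and finally clean up.
gcloud deployment-manager deployments create my-deployment --config=vm.yaml --preview
gcloud deployment-manager deployments update my-deployment
gcloud deployment-manager deployments delete my-deployment
```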
Concept | Definition / Explanation |
---|---|
Load Balancer Purpose | Distribute traffic across multiple resources (VMs, instance groups, etc.) to increase availability, reduce latency, and improve overall user experience. |
Software-Defined & Global | GCP load balancers are fully software-defined, no hardware devices needed. Certain GCP load balancers can be global (premium tier) or regional (standard tier). |
Forwarding Rule | Directs traffic (based on protocol and port) to a target backend (e.g., backend service or target pool). |
Backend Service | Defines how the load balancer distributes traffic to back ends. Contains settings like health checks, session affinity, timeout, and references to instance groups or NEGs as back ends. |
LB Type | Key Characteristics |
---|---|
HTTP(S) Load Balancing | - Global, Layer 7 (application). - Proxy-based: terminates HTTP(S) traffic at Google Front Ends (GFE). - Supports cross-region distribution & content-based routing (URL maps). - IPv4 & IPv6, IPv6 terminates at LB → forwards IPv4 to backend. - Premium-tier = global; standard-tier = regional. |
SSL Proxy Load Balancing | - Global, Layer 4 (TCP over SSL). - Terminates SSL at LB, re-encrypt or pass plain TCP to backend. - IPv4 & IPv6, IPv6 terminates at LB. - Only supports TCP with SSL (proxy). |
TCP Proxy Load Balancing | - Global, Layer 4 (TCP). - Proxy-based: Terminates TCP at LB, can re-establish TCP or SSL to backend. - IPv4 & IPv6, IPv6 terminates at LB. |
Network Load Balancing (NLB) | - Regional, Layer 4 (TCP/UDP). - Pass-through: no termination, direct server return. - Balances TCP, UDP, SSL (self-managed). - Great for non-HTTP protocols needing direct IP:port LB. |
Internal Load Balancing (ILB) | - Regional, Layer 4 (TCP/UDP). - Internal only, not internet-facing. - Balances traffic within a VPC (private IP addresses). |
Concept | Definition / Explanation |
---|---|
Instance Template | A resource that defines a VM’s configuration (machine type, disk, metadata, etc.). Re-used to create multiple VMs or Managed Instance Groups (MIGs). |
No Editing Templates | Once created, cannot edit. Must create a new template if a config changes. |
Usage | - Create MIG using an instance template. - Optionally base a new template on an existing instance. - Includes OS images (public/custom), metadata, machine type, disks, network, etc. |
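An illustrative template creation command (machine type, image family, tags, and the startup script file are placeholders):

```bash
gcloud compute instance-templates create web-template \
  --machine-type=e2-medium \
  --image-family=debian-12 \
  --image-project=debian-cloud \
  --tags=http-server \
  --metadata-from-file=startup-script=startup.sh
```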
Concept | Definition / Explanation |
---|---|
MIG Overview | - A fleet of identical VMs (stateless recommended). - MIG automatically handles scaling, healing, auto-updates, multi-zone/regional deployments, etc. - Must use instance templates to create identical VMs. |
Auto Healing | - Uses MIG health checks to replace unhealthy instances automatically. - Distinct from LB health checks (which only remove from traffic, not recreate). |
Auto Scaling | - Automatically add/remove VMs to match load (CPU utilization, custom metrics, LB-based). - Scales in to reduce cost, out to handle demand. |
Rolling Updates | - Update MIG with minimal downtime (gradual replacement). - Optionally do canary (partial rollout) with controlled pace. |
Regional vs. Zonal MIG | - Regional MIG: Distribute instances across multiple zones in the same region (higher availability). - Zonal MIG: All instances in a single zone. |
Preemptible VMs | - MIG can include preemptible instances for cost savings. - Auto healing replaces them if capacity is available when preempted. |
Stateful MIG | - Keep per-instance state (e.g., persistent disk, instance name). - Useful for partial stateful apps or unique configs, but MIG still handles auto healing. |
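A hedged sketch of a regional MIG built from the template above, with autoscaling and auto healing (names, region, thresholds, and health-check path are placeholders):

```bash
# Regional MIG of 3 instances from the instance template.
gcloud compute instance-groups managed create web-mig \
  --region=us-central1 \
  --template=web-template \
  --size=3

# Autoscale on CPU utilization between 3 and 10 instances.
gcloud compute instance-groups managed set-autoscaling web-mig \
  --region=us-central1 \
  --min-num-replicas=3 \
  --max-num-replicas=10 \
  --target-cpu-utilization=0.6

# Auto healing: replace instances that fail an HTTP health check.
gcloud compute health-checks create http web-hc --port=80 --request-path=/healthz
gcloud compute instance-groups managed update web-mig \
  --region=us-central1 \
  --health-check=web-hc \
  --initial-delay=120
```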
Concept | Definition / Explanation |
---|---|
Unmanaged Instance Group | - Heterogeneous (mixed machine types, OS, etc.). - No auto scaling, auto healing, or rolling updates. - Use only if you need load balancing across a custom set of distinct instances that you manage manually. |
No Templates | - You add existing instances to an unmanaged group. You handle all lifecycle events. |
Concept | Definition / Explanation | Key CLI Commands |
---|---|---|
Containers | Lightweight application bundles containing all dependencies. Share OS kernel while isolating processes in each container. | (Docker commands, for reference) `docker build -t [image:tag] .` `docker run -p 80:80 [image:tag]` |
Container Registry | A storage system for container images (public or private). GCP provides Artifact Registry or Container Registry. | `gcloud artifacts repositories create ...` `gcloud container images list` (for older Container Registry usage) |
Dockerfile Layers | Each line in a Dockerfile forms a new read-only layer. Final container = all layers + top read-write layer at runtime. | N/A in GCP CLI, but essential for Docker |
Pods | In pure Kubernetes, 1+ containers in a single deployable object. GKE always runs containers inside pods. | `kubectl run myapp --image=...` (in current Kubernetes versions this creates a single Pod; use `kubectl create deployment` for a managed Deployment) |
Concept | Definition / Explanation | Key CLI Commands |
---|---|---|
Kubernetes | An open-source container orchestration platform. Automates scheduling, scaling, networking for containerized apps. | `kubectl` commands to interact with any Kubernetes cluster. |
GKE (Google Kubernetes Engine) | Managed environment on GCP for Kubernetes clusters. GCP manages the control plane (master) while you control node configs. | `gcloud container clusters create [CLUSTER_NAME] --zone ...` `gcloud container clusters list` `kubectl` for cluster interactions. |
Control Plane | Composed of API Server (kube-apiserver), Scheduler (kube-scheduler), Controller Manager (kube-controller-manager), etcd. Coordinates cluster state. GKE manages these for you. | GKE auto-manages control plane, no direct `gcloud` to manage it. |
Nodes | Worker machines (Compute Engine VMs in GKE). Run container runtime (Docker/Containerd) + kubelet (agent). | Created automatically by `gcloud container clusters create`. |
Node Pools | Group of nodes sharing configuration (machine type, size, disk, etc.). You can have multiple node pools in a cluster for different workloads. | `gcloud container node-pools create [POOL_NAME] --cluster [CLUSTER_NAME]` `gcloud container node-pools list --cluster [CLUSTER_NAME]` |
Namespaces | Virtual clusters within a physical cluster. Isolate apps or teams. Pre-defined namespaces: default, kube-system, kube-public, kube-node-lease. | `kubectl get namespaces` `kubectl create namespace [NAME]` `kubectl apply -f myapp.yaml --namespace [NAME]` |
Concept | Definition / Explanation | Key CLI Commands |
---|---|---|
Cluster Types | - Zonal (single-zone or multi-zonal). - Regional (multi-zone control plane replicas). Multi-zonal & regional = better availability, typically higher cost. | `gcloud container clusters create [CLUSTER_NAME] --region=us-east1 --enable-autorepair` `gcloud container clusters create ... --num-nodes=...` |
Private Clusters | Nodes do not have public IPs, only internal addresses. Control-plane can optionally disable public endpoint. More secure, more steps for connectivity (VPC peering, NAT if needed). | `gcloud container clusters create [CLUSTER_NAME] --enable-private-nodes --master-ipv4-cidr ...` `gcloud container clusters update ... --enable-master-authorized-networks ...` |
Release Channels | Automatic cluster version upgrades, stability tiers. Rapid, Regular, Stable. | `gcloud container clusters create [CLUSTER_NAME] --release-channel=regular` `gcloud container clusters update [CLUSTER_NAME] --release-channel=stable` |
Auto-Upgrades | GKE can automatically upgrade control plane & nodes to newer patch versions. Minimizes manual overhead. | `gcloud container clusters create [CLUSTER_NAME] --enable-autoupgrade` `gcloud container node-pools update [POOL_NAME] --enable-autoupgrade --cluster=[CLUSTER_NAME]` |
Manual Upgrades | You can pin cluster version & manually do `gcloud container clusters upgrade ...`. Only recommended if you have custom reasons to test each version. | `gcloud container clusters upgrade [CLUSTER_NAME] --cluster-version=...` `gcloud container node-pools upgrade [POOL_NAME] --cluster [CLUSTER_NAME]` |
Surge Upgrades | Controls how many nodes GKE upgrades in parallel (max-surge-upgrade) and how many can be temporarily unavailable (max-unavailable-upgrade). Reduces downtime at cost of extra nodes. | `gcloud container node-pools update [POOL_NAME] --cluster=[CLUSTER_NAME] --max-surge-upgrade=2 --max-unavailable-upgrade=1` |
Concept | Definition / Explanation | Key CLI Commands |
---|---|---|
Pods | Smallest K8s object. Runs 1 or more containers. Ephemeral & disposable. Typically created/managed by higher-level objects like Deployment. | `kubectl get pods` `kubectl describe pod [NAME]` `kubectl logs [POD_NAME]` |
Pod Spec & Status | The .spec in the manifest states container specs (image, ports, volumes, etc.). The .status is updated by the K8s system. | `kubectl apply -f pod.yaml` |
Deployments | Higher-level object that manages sets of replicated Pods (ReplicaSets). Handles rolling updates, rollback, scale. Great for stateless apps. | `kubectl create deployment [NAME] --image=...` `kubectl scale deployment [NAME] --replicas=...` `kubectl rollout undo deployment [NAME]` |
ReplicaSets | Ensures a specified number of pod replicas are running. Usually handled by Deployment. | Typically not managed directly; used behind the scenes by Deployments. |
StatefulSet | For stateful apps requiring persistent identity (like DBs). Retains pod identity across rescheduling. | `kubectl apply -f statefulset.yaml` |
DaemonSet | Ensures a pod runs on every node (logging/monitoring agents). | `kubectl apply -f daemonset.yaml` |
Jobs & CronJobs | - Job runs a finite task to completion (batch work). - CronJob is a scheduled repeating job. | `kubectl create job [NAME] --image=... -- [args]` `kubectl create cronjob [NAME] --schedule="0 * * * *" --image=... -- [args]` |
ConfigMaps & Secrets | Externalize config data or secret data. Mounted as env vars or volumes. Secrets are base64-encoded. | `kubectl create configmap [NAME] --from-literal=KEY=VALUE` `kubectl create secret generic [NAME] --from-literal=KEY=VALUE` |
Config Connector | A Kubernetes add-on that manages Google Cloud resources declaratively using Kubernetes-style configuration files. | Resources are declared as Kubernetes custom resources and applied with `kubectl apply -f ...` |
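Most of these objects are declared in YAML and applied with kubectl. A minimal Deployment manifest as an illustration; the app name and image path are placeholders:

```bash
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inventory
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inventory
  template:
    metadata:
      labels:
        app: inventory          # pods carry this label; Services select on it
    spec:
      containers:
      - name: inventory
        image: gcr.io/my-project/inventory:v1.0.0   # placeholder image
        ports:
        - containerPort: 8080
EOF
```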
Concept | Definition / Explanation | Key CLI Commands |
---|---|---|
Service | Stable, persistent endpoint to access a set of pods. Each service gets a stable IP + DNS name (internal or external). Pods behind the service are dynamically updated using selectors (labels). | - kubectl get services - kubectl describe service [NAME] - kubectl expose deployment [DEPLOYMENT_NAME] --type=... --port=... |
Selector & Labels | Services route traffic to pods matching a label. Example: app: inventory. A label must match the service selector to register the pod behind that service. | - kubectl apply -f service.yaml (the spec.selector must match pods’ metadata.labels) |
ClusterIP (default) | Internal-only (virtual IP) accessible within the cluster. No external exposure. | - kubectl expose deployment [NAME] --type=ClusterIP --port=... |
NodePort | Exposes the service on a static port on each node (30000–32767). Access from outside via `[NODE_IP]:[NODE_PORT]`. | - kubectl expose deployment [NAME] --type=NodePort --port=80 (set a specific nodePort in the service manifest if needed) |
LoadBalancer | Provisions a cloud LB (e.g., GCP external load balancer). Traffic from LB -> NodePort -> Pods. Simplest way to get external IP if you want each service behind a separate LB. | - kubectl expose deployment [NAME] --type=LoadBalancer --port=80 |
Multi-Port Services | Service can map multiple ports (or port + targetPort pairs). Each port must have a unique name in the service spec. | - kubectl apply -f multiport-service.yaml |
ExternalName | Maps service DNS to an external DNS name. No cluster IP or pods. Simple CNAME-like alias. | - kubectl apply -f externalname.yaml |
Headless Service | spec.clusterIP: None. No clusterIP assigned. Allows direct pod endpoints discovery, often with StatefulSets. | - kubectl apply -f headless-service.yaml |
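A minimal Service manifest showing how the selector, port, and targetPort relate (names and ports are placeholders; change `type` to ClusterIP or NodePort as needed):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: inventory-svc
spec:
  type: LoadBalancer          # ClusterIP (default) / NodePort / LoadBalancer
  selector:
    app: inventory            # must match the pods' labels
  ports:
  - name: http
    port: 80                  # port exposed by the Service
    targetPort: 8080          # containerPort on the selected pods
EOF
```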
Concept | Definition / Explanation | Key CLI Commands |
---|---|---|
Ingress | High-level object that defines HTTP(S) routing rules for multiple Services. GKE implements Ingress via GCP’s HTTP/HTTPS Load Balancer. One IP can serve multiple paths/hosts. | - kubectl apply -f ingress.yaml - kubectl get ingress |
Ingress Controller | In GKE, the built-in controller maps Ingress resources to a Google Cloud HTTP(S) LB. | - kubectl describe ingress [NAME] |
Ingress Rules | Map host/path -> backend service in cluster. Example: /discontinued routes to Service discontinued-service. | - In ingress.yaml, under .spec.rules[].http.paths[].backend.serviceName or backend.service. |
NEG (Network Endpoint Group) | Container-native LB: each pod is an endpoint. LB routes traffic directly to pod IP. More fine-grained than standard Service NodePort. | - Add the annotation `cloud.google.com/neg: '{"ingress": true}'` to the Service metadata to enable container-native load balancing. |
GCP SSL Certificates | Ingress can reference either Google-managed or self-managed SSL certificates. One LB can hold multiple certs for SNI-based routing. | - gcloud compute ssl-certificates create [CERT_NAME] --certificate [CERT_FILE] --private-key [KEY_FILE] - Then kubectl annotate ingress [NAME] ... |
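A minimal Ingress manifest that fans out two paths to two Services; service names, paths, and ports are placeholders (on GKE this provisions an external HTTP(S) load balancer):

```bash
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: store-ingress
spec:
  rules:
  - http:
      paths:
      - path: /inventory
        pathType: Prefix
        backend:
          service:
            name: inventory-svc
            port:
              number: 80
      - path: /discontinued
        pathType: Prefix
        backend:
          service:
            name: discontinued-service
            port:
              number: 80
EOF
```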
Concept | Definition / Explanation | Key CLI Commands |
---|---|---|
Ephemeral vs. Durable Storage | - Ephemeral: Tied to pod lifecycle (e.g., `emptyDir`). - Durable: Outlives pods, typically Persistent Disks or volumes from a storage provider. | - Pods reference ephemeral volumes (like `emptyDir`) in their manifest under `.spec.volumes[].emptyDir`. - Durable volumes are typically mounted via Persistent Volume Claims. |
Kubernetes Volume | Basic storage unit in a pod. Volumes outlive containers but die with the pod (unless using persistent volumes). Examples: `emptyDir`, `configMap`, `secret`, `downwardAPI`. | - `kubectl describe pod [NAME]` shows volumes defined in `.spec.volumes`. - `kubectl get pvc` etc. |
Persistent Volume (PV) | A cluster-wide resource representing a piece of durable storage in the cluster. The actual backing can be GCE persistent disks, Filestore, etc. Lifecycle managed by K8s. | - kubectl get pv - Usually created dynamically via Persistent Volume Claims + StorageClass. |
Persistent Volume Claim (PVC) | A request for storage by a user. Binds to a suitable PV that meets the spec (e.g., size, access modes). GKE can dynamically create a persistent disk upon PVC creation if using the default or a custom StorageClass. | - `kubectl get pvc` - `kubectl apply -f pvc.yaml` (see the PVC example after this table) |
StorageClass | Defines classes of storage offered in a cluster. GKE typically has a default `standard` class (and possibly `balanced`/`ssd`). Allows dynamic provisioning of persistent disks. | - `kubectl get storageclass` - `kubectl describe storageclass [CLASSNAME]` - `kubectl apply -f custom-storageclass.yaml` |
Access Modes | Describes how volumes can be mounted: ReadWriteOnce (RWO), ReadOnlyMany (ROX), ReadWriteMany (RWX). | - Specified in pvc.yaml under .spec.accessModes . |
Regional vs. Zonal Persistent Disk | - Zonal PD: Resides in a single zone. - Regional PD: Replicated across two zones in the same region, for higher availability (failover if a zone fails). | - Create a PVC referencing `volume-type: pd-standard`, or a custom StorageClass that enables regional replication. |
Container-Native Storage | You can use GCE persistent disks, Filestore (NFS), or Cloud Storage FUSE with GKE. Cloud SQL can also be used externally. Simplest approach is persistent disks via PVC/StorageClass. | - GCE PD is automatically used if you choose storageClassName: standard in your PVC.- Filestore requires an NFS-based approach or Filestore CSI driver. |
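The PVC example referenced above — a minimal claim that dynamically provisions a persistent disk through the default StorageClass (name, size, and class are placeholders):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce             # RWO: mountable read-write by a single node
  storageClassName: standard  # GKE default class backed by a persistent disk
  resources:
    requests:
      storage: 30Gi
EOF
```

Pods then mount the claim by name under `.spec.volumes[].persistentVolumeClaim.claimName`.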
Concept | Definition / Explanation | Key CLI Commands |
---|---|---|
Creating a GKE Cluster | GKE managed environment for Kubernetes. You can create via the Console or via `gcloud container clusters create`. You define cluster type (zonal or regional), node machine types, release channels, etc. | - Console: "Kubernetes Engine" → "Create Cluster" → Fill details. - CLI: `gcloud container clusters create [CLUSTER_NAME] --num-nodes=3 --zone=[ZONE] --release-channel=regular ...` |
Node Pools | Clusters have node pools. Each pool is a group of node VMs sharing the same configuration. You can add, remove, or update node pools without affecting the entire cluster. | - gcloud container node-pools create [POOL_NAME] --cluster [CLUSTER_NAME] ... --num-nodes=2 - gcloud container node-pools delete [POOL_NAME] --cluster [CLUSTER_NAME] |
Setting Up kubectl | Must retrieve the cluster’s credentials so kubectl can communicate with the new cluster’s control plane. | - `gcloud container clusters get-credentials [CLUSTER_NAME] --zone [ZONE]` - Check: `kubectl get nodes`, `kubectl get all` |
Deploying a Container | - Option 1: Use `kubectl create deployment ... --image=...` - Option 2: Use GKE console "Deploy". Then "Expose" to create a Service (LoadBalancer, etc.). | - `kubectl create deployment box-of-bowties --image=gcr.io/[PROJECT]/box-of-bowties:v1.0.0` - `kubectl expose deployment box-of-bowties --type=LoadBalancer --port=80` |
Scaling a Deployment | Increase/Decrease the number of pods (replicas). For zero downtime, K8s performs rolling expansion/contraction. | - Console: Workloads → "Scale". - CLI: kubectl scale deployment box-of-bowties --replicas=3 - kubectl get pods (verify) |
Rolling Updates | Seamless updates. Replaces old pods with new pods, one by one. Minimizes downtime. | - Console: "Workloads" → "Rolling Update" → Provide new container image digest. - CLI: kubectl set image deployment/box-of-bowties box-of-bowties-container=gcr.io/[PROJECT]/box-of-bowties:v1.0.1 --record |
Cloud Build & Container Registry | - Cloud Build: CI/CD service. Build Docker images from source inside GCP, push to registry. - Container Registry: Stores Docker images. GCR is integrated with GCP auth + scanning. | - `gcloud builds submit --tag gcr.io/[PROJECT]/box-of-bowties:v1.0.0 .` - `gcloud container images list-tags gcr.io/[PROJECT]/box-of-bowties` - `gcloud container images delete gcr.io/[PROJECT]/box-of-bowties:v1.0.0` (cleanup) |
Cleanup | Delete resources to avoid costs: 1. Delete the LB Service 2. Delete the Deployment 3. Delete the container images 4. Delete GCS build artifacts 5. Delete the GKE cluster. | - LB/Service: `kubectl delete service box-of-bowties-service` - Deployment: `kubectl delete deployment box-of-bowties` - Cluster: `gcloud container clusters delete [CLUSTER_NAME] --zone [ZONE]` - Images: `gcloud container images delete ...` |
Concept | Description |
---|---|
Purpose / Use Cases | - Securely connect an on-premises network to a VPC over an IPsec VPN tunnel. - Good if you have moderate traffic, want encryption, and can tolerate latencies of public internet. - Site-to-site only (no client VPN). |
Key Features | - Encryption at L3 (IPsec). - HA VPN offers 99.99% SLA (two interfaces/two external IPs). - Classic VPN offers 99.9% SLA. (Google recommends new deployments to use HA VPN.) |
Routing | - Static or dynamic routing supported (dynamic with BGP/Cloud Router). - Each HA VPN gateway interface can support multiple tunnels. - Speeds up to ~3 Gbps per tunnel. |
Connectivity | - Traffic traverses public internet—but is IPsec-encrypted. - Combine with Private Google Access for on-prem hosts. |
Classic vs. HA VPN | - Classic VPN: Single IP, single interface, up to 3 Gbps, 99.9% SLA. - HA VPN: Two IPs (active/active), 99.99% SLA if both interfaces used with two external IPs, dynamic routing only. |
Concept | Description |
---|---|
Purpose / Use Cases | - Dedicated private connection from on-prem data center to Google’s network (no public internet). - High throughput, low latency. Great for large data volumes and production workloads needing stable connectivity. |
Dedicated Interconnect | - Physical link (10 Gbps or 100 Gbps) from on-prem to Google’s colocation facility (PoP). - Up to 200 Gbps total per interconnect. - Must be in a Google-supported colocation facility. - Offers private IP routing. |
Partner Interconnect | - Connect via a service provider instead of direct facility for “last-mile” connectivity. - Supports smaller increments: 50 Mbps up to 50 Gbps attachments. - Still private IP traffic, leveraging partner’s physical link. |
Cloud Router + BGP | - For dynamic routing, a Cloud Router is used with (HA) VPN or Interconnect. - BGP sessions exchange routes between on-premises network and GCP VPC. |
Direct vs. Partner | - Dedicated if you already have presence in colocation facility, need 10–100 Gbps per link. - Partner if you can’t reach a colocation PoP or only need smaller capacity. |
Aspect | Description |
---|---|
Definition | Fully-managed PaaS for hosting web apps in Google Cloud. Handles provisioning, scaling, patching. Just upload your code and let GCP do the heavy lifting. |
Standard vs. Flexible | Standard: Runs in language-specific runtimes (Python, Node.js, Java, Go, etc.). Sandboxed environment, free tier available, ephemeral local disk. Flexible: Runs in Docker containers on GCE VMs, no free tier, uses OS-level access. |
Scaling Types | - Automatic: Scale up/down based on load (can go to zero). - Basic: Instances start on request, shut down when idle. Good for intermittent workloads. - Manual: Specify fixed number of instances. |
Services / Versions | - An App Engine app can have multiple services (like microservices). - Each service can have multiple versions (for rollbacks, testing, traffic splitting). |
Traffic Management | - Traffic Migration: Move traffic from old version to new version immediately or gradually (standard environment only). - Traffic Splitting: Route percentages of traffic to each version (A/B test). |
Deploying | - Typically: gcloud app deploy [YOUR_APP_YAML] - Distinct app.yaml per service. |
Supported Languages | Common runtimes (Node.js, Python, Java, Go, PHP, Ruby, .NET). Custom Docker (flex environment). |
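A hedged sketch of a standard-environment `app.yaml` and the deploy/traffic commands (runtime, service name, and version IDs are placeholders):

```bash
cat > app.yaml <<'EOF'
runtime: python312          # standard-environment runtime (placeholder)
service: default
automatic_scaling:
  max_instances: 5
EOF

gcloud app deploy app.yaml                                        # deploy a new version
gcloud app browse                                                 # open the app in a browser
gcloud app services set-traffic default --splits=v1=0.9,v2=0.1   # split traffic across versions
```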
Aspect | Description |
---|---|
Definition | Serverless “function as a service” for single-purpose, event-driven code. GCP automatically handles provisioning, scaling, patching. |
Key Features | - Supports Node.js, Python, Go, Java, .NET Core. - Integrations with HTTP triggers, or background triggers (Pub/Sub, Cloud Storage, Firestore, etc.). - Priced by execution time and invocations. |
Execution Model | - Stateless: each invocation is handled by an instance of your function. - Single concurrency per instance, no parallel requests on same instance. |
Triggers | HTTP (direct calls), Pub/Sub (event), Cloud Storage (file uploads/deletes), Firestore (document changes). |
Deployment | - gcloud functions deploy [FUNCTION_NAME] --runtime [LANGUAGE] --trigger-http / --trigger-bucket / --trigger-topic... - Source code can be inline in the console or uploaded from local/Cloud Source Repos. |
Networking | - By default: outgoing to internet is allowed, internal VPC not allowed unless you configure a VPC connector. - Ingress control can restrict function access to internal only or LB only. |
Use Cases | - Quick data transformations, e.g. image thumbnail creation on file upload. - Asynchronous event handlers, e.g. after a Pub/Sub message. - Serverless APIs, e.g. for webhooks or form submissions. |
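Two illustrative deployments: an HTTP-triggered function and a Cloud Storage-triggered function (function names, runtime, bucket, and region are placeholders; source defaults to the current directory):

```bash
# HTTP trigger.
gcloud functions deploy hello-http \
  --runtime=python311 \
  --trigger-http \
  --allow-unauthenticated \
  --region=us-central1

# Background trigger: runs on every object uploaded to the bucket.
gcloud functions deploy on-upload \
  --runtime=python311 \
  --trigger-bucket=my-upload-bucket \
  --region=us-central1
```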
Aspect | Description |
---|---|
Definition | Global, large-capacity object storage for unstructured data. Store files/objects in buckets with globally unique names. |
Use Cases | - Storing large data sets (e.g., images, videos, archives). - Content distribution / direct public hosting. - Backup or big data analytics source. - Serving static website content. |
Buckets | - Top-level container for objects (no nesting buckets). - Name must be globally unique. - Choose location (region, dual-region, multi-region). - Choose default storage class (Standard, Nearline, Coldline, Archive). |
Objects | - Stored files in buckets, up to TBs in size. - Immutable: replaces old version, cannot edit in place. - Metadata includes object’s name, generation, etc. - No limit on number of objects. |
Storage Classes | - Standard: Frequent access, ~0.02 USD/GB/mo. - Nearline (30-day min storage): Infrequent (~1x/mo) usage, ~0.01 USD/GB/mo. - Coldline (90-day min): Rarely accessed (~1x/quarter). - Archive (365-day min): ~1x/year or long-term. |
Geo-Options | - Region (lowest-latency to your region). - Dual-Region (2 separate regions for HA). - Multi-Region (spreads data across a continent). |
Access Control | - IAM (recommended) to manage bucket- or project-level permissions. - ACLs (fine-grained object-level control, older approach). - Signed URLs for temporary controlled access. - Signed Policy Docs for controlled uploads. |
Lifecycle Management | - Automatically transition storage class (e.g., Standard→Coldline) or delete older objects. - Configured via JSON or console rules (conditions + actions). |
Object Versioning | - Store older (noncurrent) versions instead of overwriting. - Increases storage cost, often combined with lifecycle rules (delete older versions after N days). |
Typical Commands | - gsutil cp : Copy local→GCS or GCS→GCS.- gsutil mv : Move objects, changing generation number if versioning enabled.- gsutil lifecycle set/get : Manage JSON-based lifecycle rules. |
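A sketch of creating a bucket and applying a lifecycle policy that transitions and then deletes objects (bucket name, ages, and classes are placeholders):

```bash
# Regional bucket with Standard as the default class.
gsutil mb -l us-central1 -c standard gs://my-example-bucket

# Lifecycle: move to Coldline after 90 days, delete noncurrent versions after 365 days.
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
     "condition": {"age": 90}},
    {"action": {"type": "Delete"},
     "condition": {"age": 365, "isLive": false}}
  ]
}
EOF

gsutil lifecycle set lifecycle.json gs://my-example-bucket
gsutil lifecycle get gs://my-example-bucket
```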
Aspect | Description |
---|---|
Definition | Fully managed relational DB service. Supports MySQL, PostgreSQL, and SQL Server. Google handles provisioning, maintenance, backups, HA config, etc. |
Storage & Scaling | - Up to 30 TB persistent disk per instance. - Choose HDD or SSD. - Automatic storage increase if enabled. - CPU, RAM sized by instance (db-* machine types). |
Connectivity | - Public IP (with authorized networks) or Private IP (preferred if on same VPC). - Cloud SQL Proxy recommended (handles SSL/tunnels + IAM-based auth). |
Replication | - Read Replicas (for scale-out reads) up to 10 replicas. - Cross-region or in-region replicas; can replicate to external MySQL. - Promote replica → new standalone primary (no auto failover). |
High Availability | - Optionally enable HA (known as “regional” instance). - Creates synchronous standby in different zone, automatic failover → 99.95+% (varies by tier). |
Backups & PITR | - Automated or on-demand backups. - Point-in-time recovery requires binary logging (must be enabled). - By default 7 days of backup retained (configurable). |
Use Cases | - Traditional relational workloads needing strong ACID transactions. - Commonly used with external VM apps, GKE microservices, or serverless (using Cloud SQL Proxy). |
Cost | - Billed for CPU, memory, storage, backups, egress. - Different pricing for MySQL/Postgres vs. SQL Server (license included). |
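An illustrative HA (regional) MySQL instance plus a read replica (instance names, tier, region, and sizes are placeholders):

```bash
# Regional (HA) primary with automated backups and binary logging for PITR.
gcloud sql instances create prod-mysql \
  --database-version=MYSQL_8_0 \
  --tier=db-n1-standard-2 \
  --region=us-central1 \
  --availability-type=REGIONAL \
  --storage-type=SSD --storage-size=100GB --storage-auto-increase \
  --backup-start-time=03:00 \
  --enable-bin-log

# Read replica for scale-out reads.
gcloud sql instances create prod-mysql-replica \
  --master-instance-name=prod-mysql \
  --region=us-central1
```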
Aspect | Description |
---|---|
Definition | Google’s horizontal-scaling relational DB. Global, strongly consistent, highly available. 5 nines availability for multi-region. |
Key Features | - Relational model (SQL interface, schemas). - Synchronous replication for strong consistency. - Auto-sharding and high throughput, with “TrueTime” for global ordering. - Nodes are the capacity unit (CPU/RAM); can scale on the fly. |
Use Cases | - Mission-critical, globally distributed systems needing ACID transactions at scale. - Multi-region or global apps, high throughput (10k+ QPS). - e.g., financial trading, global inventory, gaming leaderboards with strong consistency. |
Replication & Regions | - Multi-region = 2 or more regions + witness, for a 5-nines SLA. - Regional instance = 1 region, multiple zones, 4-nines SLA. |
Cost | - ~0.90 USD/node/hr + storage (~0.30 USD/GB/mo). - Nodes provide CPU/RAM; can be scaled linearly. |
Aspect | Description |
---|---|
Definition | Fully managed wide-column NoSQL database for very large scale (TB–PB) with low latency and high throughput. |
Key Features | - Horizontally scalable (add more nodes for higher throughput). - Millisecond-level read/write latencies at large scale. - Integrated with Big Data / ML tools (Dataflow, Dataproc, HBase API). - Regional service; can enable multi-cluster replication for DR. |
Common Use Cases | - Time-series data (IoT, logs, sensor readings). - Ad tech or financial data ingest at massive scale. - Recommendation engines, personalization, real-time analytics. |
Cost | - ~0.65 USD/node/hr + storage usage + egress. - Not cheap, but extremely high performance at scale. |
Aspect | Description |
---|---|
Definition | Document-based NoSQL database with automatic scaling, high performance, and SQL-like queries (GQL). |
Datastore / Firestore | - Firestore is the next generation of Datastore; existing Datastore DBs are automatically migrated. - Firestore in Datastore mode = Datastore’s API on Firestore’s improved backend. |
Key Features | - ACID transactions (document-level). - Automatic scaling; strongly consistent queries by key. - GQL for queries. - Emulator available for local dev and testing. |
Use Cases | - Web/mobile user profiles, product catalogs, real-time data that needs simpler query patterns than relational. |
Aspect | Description |
---|---|
Definition | Serverless document DB for mobile/web app dev, real-time sync, offline support. |
Key Features | - Data organized as collections → documents → subcollections. - Real-time updates & offline mode for client apps. - Integrates with Firebase for mobile dev. - Automatic multi-region replication, 5-nines availability. |
Common Use Cases | - Mobile/web backends with real-time sync (chat, presence, user preferences). - Offline mode, frequently changing data. |
Aspect | Description |
---|---|
Definition | Fully managed in-memory data store (Redis or Memcached). Use as an application cache for high throughput & low latency data retrieval. |
Key Features | - Zero server ops (scalable, self-healing). - Deployed in a VPC, private IP only by default. - High availability & failover for Redis “Standard Tier.” - Great for session caching, caching frequently accessed queries, ephemeral data, etc. |
Common Use Cases | - Session caching for web apps. - Leaderboards, real-time counters. - Low-latency read access to data typically stored in slower or remote DBs. |
Service | Description | Transfer Mode | Target GCP Service |
---|---|---|---|
Storage Transfer Service | A fully managed online service that automates the transfer of data from external cloud storage providers or on-premises sources into Google Cloud Storage. | Online | Cloud Storage |
Transfer Appliance | A secure, physical hardware device designed for moving very large volumes of data to Google Cloud Storage. It is shipped to the customer, loaded with data, then returned for ingestion. | Offline | Cloud Storage |
BigQuery Data Transfer Service | A service that automates the movement of data from various external sources directly into BigQuery, helping to keep analytics data up to date. | Online | BigQuery |
Service | Type | Description | Use Cases |
---|---|---|---|
BigQuery | Data Warehouse | Fully managed, serverless data warehouse for real-time analytics using SQL. Supports batch and streaming data ingestion. | Business analytics, BI reporting, ML integration |
Pub/Sub | Messaging Service | Global, scalable messaging middleware for real-time event streaming. Publishers send messages to topics, and subscribers pull/push messages. | IoT data streams, event-driven systems, log ingestion |
Composer | Workflow Orchestration | Managed Apache Airflow service for ETL and data pipelines. Uses DAGs (Directed Acyclic Graphs) to define workflows. | Data pipelines, workflow automation |
Dataflow | Data Processing | Serverless, fully managed streaming and batch data processing using Apache Beam. | ETL, real-time data analytics, event stream processing |
Dataproc | Hadoop/Spark Clusters | Managed Hadoop, Spark, Hive, and Pig clusters. Easy to spin up/down clusters for temporary workloads. | Data lakes, Spark/MapReduce jobs, big data batch processing |
Cloud Datalab | Data Science IDE | Interactive Jupyter notebook-based environment for data exploration, analysis, and ML model development. | Data exploration, visualization, ML prototyping |
Dataprep | Data Cleaning | Serverless, visual tool for exploring, cleaning, and preparing data for analysis or ML. Auto-detects anomalies and outliers. | Data wrangling before feeding into BigQuery or ML models |
Service | Category | Description | Use Cases |
---|---|---|---|
AI Platform (Vertex AI) | ML Lifecycle Platform | Unified ML platform to train, deploy, and manage models. Supports TensorFlow, Scikit-learn, XGBoost, and more. | End-to-end ML model lifecycle |
BigQuery ML | ML in BigQuery | Run ML models directly inside BigQuery using SQL syntax. No need to move data. | Predictive analytics, forecasting |
AutoML | No-Code ML | Build custom ML models (vision, NLP, translation, tables) without needing deep ML knowledge. | Domain-specific custom models |
API | Category | Capabilities | Use Cases |
---|---|---|---|
Vision API | Image Analysis | Detect objects, faces, landmarks, and labels in images. OCR capabilities included. | Image moderation, product search |
Video Intelligence API | Video Analysis | Detect objects, activities, and speech in videos. Supports video annotation and scene change detection. | Video content tagging, surveillance |
Natural Language API | Text Analysis | Entity recognition, sentiment analysis, syntax analysis, and content classification. | Chatbots, document analysis |
Translation API | Language Translation | Translate text between 100+ languages. Supports glossary for domain-specific terminology. | Multilingual apps, e-commerce |
Speech-to-Text API | Speech Recognition | Converts spoken language into text in real-time. Supports multiple languages and noise robustness. | Voice commands, call center analytics |
Text-to-Speech API | Speech Synthesis | Converts text into natural-sounding speech. Supports 100+ voices in 20+ languages. | IVR systems, virtual assistants |
Dialogflow | Conversational AI | Build chatbots and voice bots with natural language understanding. Supports voice/text integration with Google Assistant. | Customer support bots, virtual agents |
Tool | Category | Purpose | Key Features |
---|---|---|---|
Cloud Monitoring | Metrics & Dashboards | Visualize resource health, create dashboards, set alerting policies, and monitor metrics across cloud services and VMs. | Uptime checks, multi-project monitoring, custom alerts |
Cloud Logging | Log Aggregation | Collects logs from GCP services, VMs, and on-prem systems. Allows log-based metrics and integrates with Monitoring. Logs are stored in log buckets and can be routed via log sinks to BigQuery, Pub/Sub, or Cloud Storage. | Log querying, export to BigQuery or Cloud Storage. Use the Ops Agent to collect logs from VMs. |
Error Reporting | Error Aggregation | Real-time error detection and aggregation. Automatically groups similar errors and tracks frequency and impact. | Language support (Go, Java, Python, Node.js, etc.) |
Debugger | Live Debugging | Debug production apps without stopping them. Set breakpoints and log points to inspect variables and stack traces. | Zero-downtime debugging, GitHub/GitLab integration |
Trace | Performance Tracing | Analyze app latency and request traces. Helps identify bottlenecks in microservices or API requests. | Distributed tracing, end-to-end latency insights |
Profiler | CPU & Memory Profiler | Continuously analyzes resource usage (CPU, memory) to optimize app performance. | Detect performance bottlenecks, low overhead profiling |
Use Case | Solution |
---|---|
Track CPU/memory usage of VMs | Cloud Monitoring + Ops Agent |
Create alert on GKE pod crashes | Cloud Monitoring Alerts |
Detect high error rates in app | Error Reporting + Cloud Logging |
Identify slow API requests | Cloud Trace |
Optimize app performance | Cloud Profiler |
Aggregate logs across services | Cloud Logging with Log-Based Metrics |
Trigger alerts based on logs | Log-based Metrics + Cloud Monitoring |